The distributed information systems area has seen rapid growth in research
interest as well as in practical applications in the past three years.
Distributed systems are becoming a reality; however, truly distributed databases
are still rare. For a large organization with a distributed computer network, the
problem of distributing a database includes determining:

(i) How can the database be split into components to be allocated to distinct
sites, and

(ii) How much of the data should be replicated, and how should the replicated
fragments be allocated?

In this paper we design models for solving both of the above problems.

The problems of database distribution have been attacked earlier by researchers,
but we perceive two serious shortcomings in the work known to us. First, there
is a body of work on file allocation which considers only a single file and ignores
the complexity introduced by the interlinked files which appear in realistic
databases. Second, there are models which also consider the parallel problem of
network topology and hence deemphasize the data distribution problem. The topology
of a network with remote sites is often constrained by operational considerations,
but the capabilities of network connections are such that most networks can be
reconfigured to deal well with any known load.

Most modern networks provide, at least on the logical level, complete connectivity
and have nodes that can accommodate multiple files. It is in that setting that
our model is placed; we also make the simplifying assumption that the unit
transmission cost is the same between any two nodes. We are then able to
concentrate on the problem of distribution of multi-file databases, modelled by a
conceptual model of connected relations.

Figure 1 provides an outline of an overall database design methodology which
is consistent with previous approaches proposed for a non-distributed
database environment [YaNW78, LumA79]. This figure is included to define the
context in which the problem is being solved. We assume that prior to undertaking
the distribution of the database the following activities have been performed:

•  The overall user requirements have been collected and analysed.

•  Individual application views have been modelled and integrated using
some formal techniques [e.g. NaSc78, WiEI80].
The specific inputs required for the “distribution design phase”, obtained from the
above phases, are:

1.  An enterprise schema: An enterprise schema describes the global,
canonical model of the information structure for the entire database.
The schema may be represented by listings of relations, their attributes
and domains, and definitions of connections among relations.

2.  A tabulation of transactions and their volume: The expected
load of transactions to be processed using the distributed database. We
assume that designers using the proposed methodology will be able to
identify important transactions and give a complete specification for
them. The success of the “optimization” effort is largely dependent on
how accurately and completely the transactions are specified.

3.  Distribution requirements: Users typically have a good
understanding of how they would like to partition certain data among
sites, which parts of the data must be forced to reside at the same
site, etc. These requirements are modeled as constraints in our
formulation.

With the above inputs from users and designers we proceed to develop an
optimization model for a non-redundant allocation of the database (Section 3). To
limit the proliferation of variables we have made two simplifications. First, it is
assumed that all possible ways of partitioning an object are prespecified and that
the model will either select one of the candidate partitionings or allocate an object
as a whole. Second, the logical access paths used in processing a transaction are
deterministically specified. The latter allows us to focus on data distribution rather
than mixing distribution with the optimization of transaction execution itself.

In spite of the above simplifications, the problem for a realistic
database (with tens of sites and hundreds of data entities) would still involve
thousands of variables in a zero-one integer programming formulation. Since current
algorithms are good only for solving problems of the order of 60 to 100 variables, it
is necessary to decompose the original distribution problem into subproblems. The
decomposition model is formulated as another integer program (Section 4). Finally,
we develop a heuristic procedure which starts off with a given non-redundant
optimal solution and determines the most beneficial replication of an object (Section
5). Section 6 includes an example of a database and a set of transactions for it,
and demonstrates how the non-redundant optimization model produces different
solutions for distribution as the cost parameters and frequencies of transactions are
varied.


1.2  Previous  Related  Work 

As  mentioned  above,  the  previous  work  has  been  mainly  in  two  areas:  file  allocation 
and  network  topology  applied  to  databases  and  communication  networks. 

The file allocation problem was first investigated by Chu [Chu69]. He developed
a global optimization model to minimize overall operating costs under the
constraints of response time and storage capacity with a fixed number of copies of
each file. The integer program had a very large number of variables for even small
problems and was computationally infeasible. Casey [Case72] relaxed the assumption
of a fixed number of copies and stressed the difference between updates and
retrievals. Whitney [Whit70] as well as Casey addressed the combined problem of
file allocation and communication network design by restricting attention to tree
topologies. Eswaran [Eswa74] proved that Casey’s formulation was polynomially
complete; hence he suggested that heuristic rather than deterministic approaches be
investigated. Several studies have been made in the area of vertical partitioning and
clustering of single files, giving rise to integer programming or heuristic approaches
[HoSc75, Ho76, HaNi79]. In the present paper we will not consider vertical
partitioning of the database objects per se, since in the distributed environment it
necessitates a replication of keys and the model of transaction processing becomes
too complex.

The second problem category has been explored in several studies with different
sets of assumptions and addressing different sets of parameters. Mahmoud and
Riordon [MaRi76] considered the combined problem of optimal file allocation and
channel capacity determination, whereas Morgan and Levin [MoLe77] examined
the allocation of both files and the programs that process them within a generalized
network. By ignoring storage capacity constraints and introducing some other
simplifying assumptions, Morgan and Levin demonstrated that the multiple file
allocation problem can be decomposed into single file allocation problems.
Ramamoorthy and Wah [RaWa79] analyzed a relational distributed database for
optimization of query processing. By introducing redundant files, they showed how
communication costs attributed to joins can be minimized. Irani and Khabbaz
[IrKh79] have combined file allocation, network topology design and channel
capacity allocation into a single problem. Their model minimizes the total cost of
file storage and communication capacity over different channels under the
constraints of a minimum level of network reliability, minimum availability of
single files and maximum allowed communication delays.
2.  PRELIMINARY  DEFINITIONS

In order to address the general problem of distributed database design and to
develop models which are widely applicable, it is necessary to define the notions of a
logical database schema, occurrences of schema constructs, and the data
manipulation operations in a general way. The concept of horizontal partitioning may
then be applied to the individual constructs of a schema, and database transactions
can be described using a small number of manipulation primitives. We summarize in
this section the concepts and definitions which are necessary to develop the
subsequent optimization and heuristic models. A more detailed discussion of the
issues related to the modelling of logical schemas, transactions, and partitioning is
presented in [NaCW81].

It  is  assumed  that  the  integration  of  user  views  has  already  been  done  and 
that  the  logical  schema  which  is  subjected  to  distribution  is  a  global  view  or  an 
enterprise view. (See [ElWi79, LumA79] for details on views and their integration.)
In  our  model  of  the  logical  schema,  we  have  done  away  with  an  explicit  accounting 
of  the  semantics  of  various  relationships  whenever  possible,  since  all  semantics  do 
not  have  a  bearing  on  the  distribution  problem.  The  logical  schema  of  a  database 
is  modelled  as  a  directed  graph  with  objects  as  nodes  and  links  as  edges.  Objects 
represent  entities,  events,  things,  or  concepts  of  interest  to  a  community  of  users. 
The  links  represent  relationships  among  objects. 


2.1  The  logical  schema  model 

We  will  now  define  the  components  of  a  logical  schema,  namely  objects  and  links, 
which  are  needed  for  the  task  of  designing  a  distribution.  Included  in  the  discussion 
of  links  is  the  use  of  join  operations. 

Object: An object is a BCNF relation [Codd74]. Each object has a
unique primary key. A non-key column in an object typically represents
an attribute of the real world object or of a relationship. An object has a
unique name and index $i$, $1 \le i \le R$. An object instance is represented
by an n-tuple from the object. Upper case letters $O_1, O_2, \ldots$ will denote
objects, whereas lower case letters $o_1, o_2, \ldots$ will denote object instances.

Link: A link represents a binary relation among objects and specifies an
ordered pair of objects. A link is described by an index $h$, $1 \le h \le L$,
and may, optionally, have a name. The following functions are defined
for a link $h$:

$own : M \to I$

and

$memb : M \to I$,

where $M = \{1, 2, 3, \ldots, L\}$ and $I = \{1, 2, 3, \ldots, R\}$.

These functions return the index of the owner (member) object, given
the index of the link.

An instance of a link $l = (O_1, O_2)$ owned by $O_1$ is an ordered pair
$\langle o_1, o_2 \rangle$, where $o_1 \in O_1$ and $o_2 \in O_2$.


A member object instance participates in one and only one instance of a
given  link.  In  the  graphic  notation,  the  link  is  directed  from  the  owner 
object  to  the  member  object. 

One-to-many relationships among objects are modelled directly using a
link which associates many instances of the member object with a single
instance of the owner object. Many-to-many or “n-ary” relationships
among objects (n > 2) are modelled by means of an “intersection object”
which is owned by several owners via different links. Figure 2 shows
some examples of the use of links.

Join Specification: Each link $h$ has an associated join specification
$JS_h : O_i \times O_j \to \text{Boolean}$, with $i = own(h)$, $j = memb(h)$.

It maps pairs of object instances from the owner and the member of the
link to true or false depending on whether or not they match the join
specification. For ease of treatment, we restrict the join specification in
the following discussion to the equijoin only.

Informally, $JS_h$ is the conjunction of equipredicates of the type
$O_i.C_i = O_j.C_j$, where $C_i$ and $C_j$ are attributes from objects $O_i$ and
$O_j$, or columns in the corresponding relations.

We  further  assume  that  the  join  specification  exhaustively  includes  those 
columns  which  constitute  the  primary  key  of  the  owner  object. 
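
The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Link` class, the dictionary layout, and the two-object Department/Employee schema are hypothetical names chosen for the sketch, not part of the model's notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """A link h: an ordered (owner, member) pair of object indices."""
    index: int            # h
    owner: int            # own(h)
    member: int           # memb(h)
    join_columns: tuple   # pairs (owner column, member column) of the equijoin


# Hypothetical schema: object 1 = Department, object 2 = Employee.
objects = {1: "Department", 2: "Employee"}
links = {1: Link(index=1, owner=1, member=2, join_columns=(("D#", "D#"),))}


def own(h):
    """Index of the owner object of link h."""
    return links[h].owner


def memb(h):
    """Index of the member object of link h."""
    return links[h].member


def js(h, owner_instance, member_instance):
    """Join specification JS_h: a conjunction of column equalities."""
    return all(owner_instance[oc] == member_instance[mc]
               for oc, mc in links[h].join_columns)
```

Object instances are represented here as plain dictionaries keyed by column name, so `js(1, {"D#": 372}, {"D#": 372, "E#": 1})` evaluates the equijoin for one owner/member pair.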

The above idea of predefined links deserves further explanation. We recognise in
the logical schema those particular equijoins which

(a) involve the primary key of the owner object and a compatible set of domains
from the member object, and

(b) are significant on the basis that these joins will be heavily used by
transactions.

The existence of a link arises due to some real-world relationship which exists
between the objects [Chen78, WiEI80]. However, in going from a high-level semantic
model of the database to the logical schema for distribution design, some
simplifications due to the following events are likely:

Unimportance: Relationships which express a semantic connection among
objects, but are not used in any of the important transactions that are the basis for
the distribution design, may be eliminated.

Unmodelled: Relationships which express a connection among objects that
corresponds to a join other than the type of equijoin mentioned in (a) above are not
modelled as links. The set of permissible transactions, however, is not constrained.
These joins may be performed by some transactions and give rise to an execution
cost which considers the use of links whenever possible and is otherwise based on
processing algorithms which do not require links.

Non-equijoins: In cases where several types of joins are possible between two
objects, if a link is shown in the logical schema connecting those two objects, that
link is used only for the equijoin. All other joins proceed as if no link existed.
The physical realization of a link may take several forms (e.g., an index, a
pointer array, etc.). If there is a link between objects Department and Employee
with the join specification Department.D# = Employee.D#, it implies that given a
value for D#, e.g., D# = 372, it is possible to access all instances of the Employee
object having D# = 372 without an exhaustive search.

Quantitative parameters

Each object i has:

cardinality: total number of instances of that object, called card(i).
size: total length of an object instance (tuple) in bytes, called size(i).

Each link h has:

image: the fraction of the instances of the owner object participating in that
link, image(h).

average cardinality: the average number of instances of the member object
which are associated with an owner instance which participates in the
link, avcard(h).

membership ratio: the ratio of the total number of instances of the member
object to the total number of instances of the owner object for the link,
ratio(h).

The following relationships exist:

$ratio(h) = image(h) \times avcard(h)$

$card(memb(h)) = card(own(h)) \times ratio(h)$

The cardinality of an object is shown in the logical schema by a number in the top
right corner of the rectangle representing the object. The image and the average
cardinality of the link are shown at the head and the tail of the arrow which
represents a link (see Figure 3).
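
As a small worked illustration of the two relationships between the link parameters, the following Python sketch computes the membership ratio and the member cardinality. The function names and the numbers (50 owner instances, half of which participate in the link with 40 members each) are hypothetical and chosen only for the example.

```python
def ratio(image, avcard):
    """Membership ratio of a link: image(h) * avcard(h)."""
    return image * avcard


def member_cardinality(owner_card, image, avcard):
    """card(memb(h)) = card(own(h)) * ratio(h)."""
    return owner_card * ratio(image, avcard)


# Hypothetical link: 50 owner instances, half of them participate in the
# link, and each participating owner has 40 member instances on average.
link_ratio = ratio(image=0.5, avcard=40)
member_card = member_cardinality(owner_card=50, image=0.5, avcard=40)
print(link_ratio, member_card)
```

With these assumed figures the membership ratio is 20 and the member object has 1000 instances.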


2.2  Partitioning 

In  this  subsection  we  define  the  notion  of  primary  and  derived  horizontal  partition¬ 
ings.  These  definitions  are  formulated  with  the  following  implicit  assumptions: 

Knowledge of use: For a given database, it is expected that a “user” (this
term is also a synonym for a “group of users” or a “team of designers”, etc.) has
a good understanding of the data objects and links in the logical schema and also
knows the potential uses of the database at various sites. Such knowledge is often
obtained during the integration phase, where the conceptual database model has
been constructed [ElWi79]. This knowledge may be used in defining meaningful
horizontal partitionings of data objects and the allocation of partitions to various
sites.

Knowledge of linkage: The user is further knowledgeable about how the
partitioning applied to one object may be “propagated” to other objects via links.
Propagation here implies using identical criteria for partitioning of multiple objects.


Horizontal Partitioning: The horizontal partition of an object is a subdivision
of the instances of the object into disjoint subsets. Each such subset, called a
fragment, has the same attributes as the original object, and can be allocated on a
particular site of the database operational system. A horizontal partition satisfies
a predicate which is a boolean expression made up of clauses:
<domain name> <operator> <value set>

For  each  object  we  consider  two  types  of  horizontal  partitioning,  resulting  in  primary 
and  derived  partitions. 

Primary Partitioning: A primary partitioning $p$ of an object defines $N_p$
different fragments of the object on the basis of $N_p$ disjoint predicates.

Let $PRED(i, p)$ be the set of $N_p$ predicates for partitioning the object $i$
according to partitioning $p$. Let $f(i,p,q)$ be the $q$th fragment and let
$pred_{ipq} \in PRED(i, p)$ be the predicate which defines the $q$th fragment of
object $i$ under partitioning $p$. Then:

$o \in f(i,p,q) \leftrightarrow pred_{ipq}(o)$

$\forall o \in O_i, \ \exists!\, q \mid pred_{ipq}(o)$
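
The two conditions above (an instance belongs to a fragment exactly when it satisfies that fragment's predicate, and every instance satisfies exactly one predicate) can be illustrated with a minimal Python sketch. The function name `fragments`, the dictionary representation of instances, and the sample Department data are assumptions made for the illustration only.

```python
def fragments(instances, predicates):
    """Split instances into one fragment per predicate.

    Raises ValueError if some instance satisfies zero or several
    predicates, i.e. if the predicates do not define a partitioning.
    """
    result = [[] for _ in predicates]
    for o in instances:
        matches = [q for q, pred in enumerate(predicates) if pred(o)]
        if len(matches) != 1:
            raise ValueError(f"instance {o!r} matches {len(matches)} predicates")
        result[matches[0]].append(o)
    return result


# Hypothetical Department instances partitioned by location.
departments = [
    {"D#": 1, "loc": "CA"},
    {"D#": 2, "loc": "NY"},
    {"D#": 3, "loc": "CA"},
]
predicates = [
    lambda o: o["loc"] == "CA",   # fragment q = 0
    lambda o: o["loc"] != "CA",   # fragment q = 1
]
by_location = fragments(departments, predicates)
```

The explicit check on the number of matching predicates is the design point: a set of predicates that overlaps or leaves instances uncovered is rejected rather than silently producing a non-disjoint or incomplete partitioning.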

The  allocation  of  fragments  to  the  sites  of  the  network  is  also  given  by  the  user.  This 
feature,  which  greatly  simplifies  the  optimization  model,  comes  from  the  fact  that, 
in  real  applications,  the  user  associates  candidate  fragments  to  allocation  sites  in  a 
natural  way;  only  because  of  this  association,  the  user  can  determine  the  potential 
convenience  of  a  given  partitioning. 

The allocation of fragments is therefore an input to the application; as will
be  shown  later,  the  non-redundant  optimization  model  determines  which  of  the 
alternative  candidate  partitions  (if  any)  should  be  applied  to  an  object  which,  in 
turn,  determines  their  allocation. 

Derived Partitioning: Derived partitioning is the concept of partitioning an
object by applying the set of partitioning predicates which apply to another object
so as to “derive” the partitions of the former object. Once a partitioning is defined
for object i, the objects which are connected via a link with this object i become
candidates for derived partitioning. Two important considerations are:

appropriateness of a derived member partitioning: Given the partitioning
of an owner object, it is always possible to define a corresponding derived
partitioning of a member object via a link. Such derived partitioning may or may not
be appropriate. Hence the user needs to confirm whether partitionings suggested by
the dependency model are to be considered.

desirability  of  a  derived  owner  partitioning:  Given  the  partitioning  of 
a  member  object,  it  is  not  necessarily  possible  to  define  a  corresponding  derived 
partitioning  of  an  owner  object  along  a  link.  A  horizontal  partitioning  can  only  be 
derived  if  object  instances  from  any  single  fragment  of  the  member  object  map  into 
only  one  fragment  of  the  owner  object.  In  this  case  the  derivation  of  the  partitioning 
is  feasible.  In  the  structural  model  this  condition  is  true  for  identity  and  ownership 
connections whenever the partitioning predicate refers to the ruling part [WiEI80].
The user must categorically state any such derived partitionings which are not only
feasible but also desirable.

As a result of the above considerations a user is in a position to define “a
hierarchy of derived partitionings”. For example, in Figure 4, a given database has
the possibility of two derived partitioning hierarchies on the basis of a primary
partitioning of MED-DEPTS either by location or by a range of department numbers.
These hierarchies involve a propagation of partitioning along owner-member links.
The third hierarchy is based on a partitioning of PROCEDURES into surgical and
non-surgical procedures. The derivation of the partitioning along PROCEDURES to
MED-DEPTS and MED-DEPTS to DOCTORS proceeds from member object to owner object
and is feasible since MED-DEPTS and DOCTORS can be partitioned into two
corresponding subsets. A more formal definition of derived partitioning follows:

For a given object $i$ and its particular partitioning $p$, there may be an associated
set $D(i, p)$ of derived partitionings. An element of this set is a triple as follows:

$D(i, p) = \{ \langle i_1, i_2, h \rangle \mid \text{properties i, ii, iii, and iv hold} \}$

$i_1$ is the object from which the partitioning is derived
$i_2$ is the object to which the partitioning is applied
$h$ is the link via which the partitioning is derived.

i  $(i_1 = own(h) \wedge i_2 = memb(h)) \vee (i_2 = own(h) \wedge i_1 = memb(h))$

ii For an object instance $o \in O_{i_2}$,

$o \in f(i_2, p, q) \leftrightarrow pred_{i_2 p q}(o)$

$\leftrightarrow \exists \langle i_1, i_2, h \rangle \in D(i, p) \wedge \exists o' \in O_{i_1} \mid JS_h(o, o') \wedge pred_{i_1 p q}(o')$

iii Object $i$ appears at least once in the first column of $D(i, p)$,
and $p$ is a primary partitioning for $i$

iv The second column of $D(i, p)$ does not contain the same object more
than once.

Note that the properties i and ii allow one to start with the object $i$ on which
the original partition is defined and define the derived partitioning for all objects
separated from $i$ by the distance of a single link. The process may be repeated to
propagate a partitioning to nodes further apart than a single link.
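
Property ii can be read as a semi-join: an instance of the target object falls in fragment q when it joins, via the link, with some instance in fragment q of the source object. The following Python sketch illustrates the owner-to-member direction only, under the assumption that every member instance joins with owner instances of at most one fragment. The function name, the data, and the treatment of unmatched members (they are simply left out) are choices made for this illustration.

```python
def derive_member_fragments(owner_fragments, member_instances, join_spec):
    """Derive member fragments from owner fragments along a link.

    A member instance is placed in fragment q if it joins with some
    owner instance of owner fragment q. Members that join with no
    owner instance are left out of every fragment in this sketch.
    """
    derived = [[] for _ in owner_fragments]
    for m in member_instances:
        for q, owner_fragment in enumerate(owner_fragments):
            if any(join_spec(o, m) for o in owner_fragment):
                derived[q].append(m)
                break  # a member instance belongs to one link instance only
    return derived


# Hypothetical data: departments already split into two fragments.
owner_fragments = [[{"D#": 1}, {"D#": 3}], [{"D#": 2}]]
employees = [
    {"E#": 10, "D#": 1},
    {"E#": 11, "D#": 2},
    {"E#": 12, "D#": 3},
]


def same_department(owner, member):
    """Equijoin on the department number."""
    return owner["D#"] == member["D#"]


derived = derive_member_fragments(owner_fragments, employees, same_department)
```

Repeating the call with `derived` as the new owner fragments and the next link's join specification corresponds to propagating the partitioning further than a single link.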

Parameters Describing Primary Partitionings: Each fragment $q$ of
partitioning $p$ of object $i$ has:

allocation: the site where the fragment of the database relation is potentially
allocated, called $alloc(i,p,q)$

fraction: the ratio between the number of instances of the fragment and the
total number of instances of the object, called $fr(i,p,q)$
Parameters Describing Derived Partitionings: Each fragment $q$ of an
object $i$ which has a derived partitioning according to partitioning $p$ has an
allocation and a fraction as before. Allocations are preserved through the join
derivation; therefore

$alloc(i, p, q) = alloc(i', p, q) \mid \exists\, i', h \mid \langle i', i, h \rangle \in D(i', p)$

In the absence of a more precise user specification, we assume that join predicates
do not alter the fraction of primary partitions; therefore we have

$fr(i, p, q) = fr(i', p, q) \mid \exists\, i', h \mid \langle i', i, h \rangle \in D(i', p)$


2.3  TRANSACTION  MODELING 

One  of  the  critical  inputs  to  the  distribution  design  process  is  the  specification  of 
the  overall  transaction  load  on  the  database.  It  is  assumed  that  the  users  have 
a  good  notion  of  the  important  transactions  that  will  run  against  the  database 
being  designed.  It  is  expected  that  at  least  the  more  important  transactions  on 
the  database  will  be  identified  and  specified  in  terms  of  the  proposed  method  of 
specification.  These  transactions  allow  an  estimation  of  the  total  volume  of  data 
being  accessed  and  transmitted.  The  optimization  model  gives  a  solution  which 
minimizes  the  transaction  processing  cost. 

Each  transaction  is  described  in  terms  of  the  following  four  basic  access  primi¬ 
tives  {TDA,  SDA,  TJA,  SJA}  which  are  similar  to  the  access  path  primitives 
proposed  by  Su  et  al  [SuLL81]. 

TDA,  Total  Direct  Access:  This  access  primitive  models  the  accessing  of 
all  instances  of  an  object  in  a  transaction.  For  an  object  i,  the  number  of  instances 
accessed  is  card(i). 

SDA,  Selective  Direct  Access:  This  access  primitive  models  the  accessing 
of  only  a  selected  subset  of  the  instances  of  an  object  in  a  transaction.  The  selection 
criterion  is  prespecified  in  the  transaction  and  the  selectivity,  or  the  number  of 
instances  selected  is  estimated  and  provided  by  the  user. 

TJA, Total Join Access: This models the access along a link, either to the
member object or to the owner object.

owner to member: Access to j = memb(h) from i = own(h) via link
h, given that N instances of i have been accessed. Then the number
of instances of j accessed is N × ratio(h).
member to owner: Access to j = own(h) from i = memb(h) via link
h, given that N instances of i have been accessed. Then the number
of instances of j accessed is N.

SJA,  Selective  Join  Access:  This  models  the  access  along  a  link  in  either 
direction,  as  in  the  case  of  TJA  above.  However,  the  number  of  instances  of 
the  target  object  accessed  is  estimated,  and  provided  by  the  user.  It  reflects  the 
selections  made  on  the  accessed  object. 

We have defined a graphic notation to go along with the above four access
primitives; examples are shown in Figure 5.
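
The instance counts associated with the four primitives can be summarized in a short Python sketch. The function names mirror the primitive names, but the signatures, the direction strings, and the sample figures are assumptions introduced for the illustration.

```python
def tda(card):
    """Total Direct Access: all card(i) instances of the object."""
    return card


def sda(selected):
    """Selective Direct Access: the user-estimated number of selected instances."""
    return selected


def tja(n, ratio_h, direction):
    """Total Join Access along a link, given n instances already accessed."""
    if direction == "owner_to_member":
        return n * ratio_h
    if direction == "member_to_owner":
        return n
    raise ValueError(f"unknown direction: {direction}")


def sja(estimated):
    """Selective Join Access: the user-estimated number of target instances."""
    return estimated


# Hypothetical trace: select 5 owner instances, then follow a link with
# ratio(h) = 20 to the member object, then return to the owner object.
selected_owners = sda(5)
members_reached = tja(selected_owners, 20, "owner_to_member")
owners_reached = tja(members_reached, 20, "member_to_owner")
```

In this trace 5 selected owner instances lead to 100 member instances, and the return traversal counts 100 owner accesses, one per member instance.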
2.3.1  Transaction  Specification

The cost parameters in the optimized non-redundant model of distribution are
derived from the complete specification of all important transactions known to be
applied to the proposed database. Figure 6 shows a graphic description of the
following transaction on the database of Figure 3.

FIND  all  Employees  in  the  Departments  in  California 
who  have  Projects  which  need  Part#  =  7386. 

LIST  Emp#,  Dept#,  Proj#,  Budget#. 

A user interface of the type shown in Fig. 7 is postulated, in which the user would
essentially trace a transaction through the database with the numbers shown in
various columns and fill in the table with the number of instances touched by the
transaction wherever selective accesses are concerned. The system could then compute
the “number accessed” by filling in the missing numbers in that column and also
compute the total size of data transmitted “along the links”.


2.3.2  Transaction  Execution 

Different  distributed  systems  have  their  own  ways  of  implementing  transactions. 
The  model  of  transaction  execution  which  is  implicitly  assumed  in  the  above  dis¬ 
cussion  is  a  simple  model  as  follows: 

i  A  transaction  originates  at  a  specific  site.  The  transaction  has  entry  points 
corresponding  to  the  objects  which  it  accesses  first. 

ii  A transaction proceeds serially by performing TDA or SDA of objects
depending on whether or not a selection condition is imposed on objects. At each
stage, the domains which are required to construct the end result of the transaction
are forwarded to the next object.

iii  A join is performed whenever a link is traversed by a transaction. A join
causes the transmission of the columns which are a part of the join specification for
the link concerned. The size of the join columns added to the columns required for
output becomes the “size of tuple transmitted” in the specification table.

iv  The “Partitioning predicate-matched” column in the transaction
specification indicates whether a transaction has a built-in selection
predicate which matches any of the predefined horizontal partitioning predicates.
The cost of the execution of a transaction is reduced whenever a matched predicate
is actually selected for partitioning.

v  The retrieval versus update of an object has no impact on the execution of
a transaction in the non-redundant distribution of data. It is, however, considered
important during the analysis of redundant distribution.
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The  transaction  execution  model  assumes  that  the  user  has  a  good  “understanding” 
of  the  actual  execution  strategy  that  will  be  employed  by  the  system,  as  it  requires 
a  deterministic  description  of  the  transaction's  access  path.  This  is  quite  adequate 
for  most  of  procedural  database  languages;  it  can,  however,  be  critical  for  advanced 
relational  systems,  where  the  system’s  optimizer  m*\y  behave  in  a  different  way  than 
the  user  expects,  and  lead  to  unanticipated  executions.  In  defense  of  the  proposed 
approach,  it  can  be  argued  that: 

1  It  is  not  now  possible  to  combine  both  transaction  processing  optimisation 
and  database  distribution  optimization;  because  both  problems  are  very  hard  and 
query  processing  optimization  is  highly  dependent  in  the  features  of  the  particular 
system  considered.  lienee,  when  the  database  distribution  problem  alone  is  to  be 
solved,  a  simple  characterization  of  transaction  execution  cost  should  be  given. 

2  Any strategy determined by the system which differs from the one proposed by
the user can only improve the performance of the system; hence the proposed
execution strategy can be considered as a worst-case estimate.

3  The database distribution phase can be repeated once the system is operational,
with a better characterization of transaction execution strategies and measurements
of the system’s loads.

2.3.3  Transaction  Parameters 

The following parameters are related to transactions and will be employed in the
subsequent formulation. For ease of reference they are collected here:

transaction index : $k$, $1 \le k \le T$

object index : $i$, $1 \le i \le R$

link index : $h$, $1 \le h \le L$

For each transaction $k$, the following are defined:

[symbol illegible in source] = 1 if object $i$ is the entry point of transaction $k$
= 0 otherwise.

$f_i^k$ = 1 if object $i$ is the final object accessed during transaction $k$
= 0 otherwise.

$u_i^k$ = 1 if object $i$ is updated during transaction $k$
= 0 otherwise.

$m_{ip}^k$ = 1 if transaction $k$ matches partitioning predicate $p$ for object $i$
= 0 otherwise.

$o(k)$ = the site of origin of transaction $k$.

$f(k) = \{\, i \mid f_i^k = 1 \,\}$ is the set of final objects in transaction $k$.

$r_i^k$ = number of instances of object $i$ selected by transaction $k$.

$t_i^k$ = total number of accesses to object $i$ made by transaction $k$.

Note: $t_i^k \ge r_i^k$

$a_i^k$ = data in bytes shipped from object $i$ to the site which stores the next
object accessed by the transaction, or returned as a result when $i$
is the final object.

$l_i^k$ = index of the link accessed next after object $i$.

$Next_i^k$ = index of the object accessed next after object $i$.

Note: $Next_i^k$ can be derived from $l_i^k$.

3  THE  NON-REDUNDANT  DISTRIBUTION  OF  A  DATABASE

In  this  section  a  model  for  the  non-redundant  distribution  of  a  database  is  presented. 
The  objective  of  this  model  is  to  allocate  the  objects  to  sites  either  wholly  or  in 
non-redundant  fragments  so  as  to  minimise  a  global  cost  function.  The  highlights 
of  the  model  are  the  following: 

a:  One of the possible alternative sites for the allocation of each object
is selected.

b:  The model takes into account access costs at the various sites and
data transmission costs between nodes; these two criteria also ensure
that the optimal solution is “good” from the viewpoint of efficiency.

c:  The minimization of data transmission costs is obtained by allocating
links or predefined join paths in such a way that the joins of entire
objects or between partitioned objects can be performed locally.

d:  The model of execution strategy for a given “logical” transaction
is deterministic, and in fact the transaction consists of a sequence of
retrieval or update actions on objects and of link traversals (navigation
in the database) between objects. Having a non-redundant model allows
evaluation of each access or transmission cost on objects or links in an
additive rather than combinatorial fashion. A linear zero-one formulation
is possible, involving decision variables associated with possible
allocations of each object and of each link.


3.1  Variables 

We define two variables, $X$ and $Y$, to describe the state of each object, and two
variables, $V$ and $W$, to describe the state of each link.

$X_{ip}$ = 1 if the object $i$ is partitioned according to partition $p$, either in
a primary or in a derived way
= 0 otherwise.

$Y_{ij}$ = 1 if the object $i$ is allocated on the site $j$ as a whole
= 0 otherwise.

$W_{hp}$ = 1 if link $h$ is used for deriving a partition in the hierarchy of
derived partitions of partitioning $p$, and $own(h)$ and $mem(h)$
are both partitioned using $p$, either in a primary or in a derived
way
= 0 otherwise.

For a link $h$, there are as many $W_{hp}$ variables as the number of distinct
possible partitionings that are derived using the link.

$V_{hj}$ = 1 if link $h$ is local to site $j$, because $own(h)$ and $mem(h)$ are
allocated on $j$ as a whole
= 0 otherwise.

For a link $h$, there is a $V_{hj}$ variable for every potential site $j$ to which
that link could be allocated.

3.2  Constraints

Clearly,  the  decision  variables  are  constrained  in  order  to  be  consistent  with  their 
definition.  We  have: 

(1)  a non-redundancy constraint

$$\sum_p X_{ip} + \sum_j Y_{ij} = 1 \quad \text{for every object } i$$

(2)  consistency constraints for variables $W$

$$W_{hp} \le X_{own(h)\,p} \;\wedge\; W_{hp} \le X_{mem(h)\,p} \quad \text{for every } W_{hp} \text{ introduced in the model}$$

(3)  consistency constraints for variables $V$

$$V_{hj} \le Y_{own(h)\,j} \;\wedge\; V_{hj} \le Y_{mem(h)\,j} \quad \text{for every } V_{hj} \text{ introduced in the model}$$

Constraints (2) and (3) are effective because the coefficients of the $V$ and $W$
variables in the goal function are negative, and hence these variables tend to
assume the value 1 in the solution of the problem; however, this can occur only if
the $X$ and $Y$ variables of both the owner and the member of the link to which $V$
and $W$ refer, i.e. those on the r.h.s. of the inequalities, are also set to 1.
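As an illustration, the three constraints can be checked mechanically for a candidate 0/1 assignment. The following Python sketch is not part of the model itself: the function name `violated_constraints`, the dictionary layout, and the two-object example schema are assumptions made only for this example.

```python
def violated_constraints(objects, X, Y, W, V, own, mem):
    """List violations of constraints (1)-(3) for a 0/1 assignment.

    X[(i, p)], Y[(i, j)], W[(h, p)], V[(h, j)] hold 0 or 1;
    own[h] and mem[h] give the owner and member object of link h.
    """
    violations = []
    for i in objects:
        # (1) each object is either partitioned once or stored whole once
        total = sum(v for (obj, _), v in X.items() if obj == i)
        total += sum(v for (obj, _), v in Y.items() if obj == i)
        if total != 1:
            violations.append(("non-redundancy", i))
    for (h, p), w in W.items():
        # (2) W_hp may be 1 only if owner and member both use partitioning p
        if w > X.get((own[h], p), 0) or w > X.get((mem[h], p), 0):
            violations.append(("W-consistency", h, p))
    for (h, j), v in V.items():
        # (3) V_hj may be 1 only if owner and member are both whole on site j
        if v > Y.get((own[h], j), 0) or v > Y.get((mem[h], j), 0):
            violations.append(("V-consistency", h, j))
    return violations


# Hypothetical schema: link "works_in" from Department (owner) to Employee (member).
objects = ["Department", "Employee"]
own, mem = {"works_in": "Department"}, {"works_in": "Employee"}
X = {("Department", "by_location"): 1, ("Employee", "by_location"): 1}
Y = {("Department", "site1"): 0, ("Employee", "site1"): 0}
W = {("works_in", "by_location"): 1}
V = {("works_in", "site1"): 0}
print(violated_constraints(objects, X, Y, W, V, own, mem))
```

Setting a $V$ variable to 1 while neither object is stored whole on that site is reported as a violation of constraint (3).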


3.3  Cost  parameters 

Access  costs  are  proportional  to  the  number  of  object  instances  accessed  in  a  given 
node  at  a  given  site;  further  sophistication  is  not  possible,  as  the  physical  database 
design  at  local  sites  will  be  performed  in  a  later  phase;  a  similar  approach  is  taken 
in  [TeFr80]  for  the  logical  design  of  a  non-distributed  database. 

We distinguish between the following types of cost units:

$CLR_j$  :  unit cost for local retrieval accesses
$CLU_j$  :  unit cost for local update accesses
$CRR_j$  :  unit cost for remote retrieval accesses
$CRU_j$  :  unit cost for remote update accesses

The above four combinations, generated by retrieval vs. update and local vs.
remote access, are reasonable categories since they incur different costs attributed
to authorization, concurrency, recovery, etc.

Transmission costs are proportional to the actual sizes of the data transmitted,
and do not depend on the source and destination sites. It seems impossible
to give more sophisticated models, as it is very difficult to predict actual costs
between pairs of sites. The cost may not be fixed in some systems because of
the use of dynamic routing algorithms for transmissions (e.g. in ARPANET); local
networks (e.g. ETHERNET) are accurately modelled in this way.

We  have: 

$TC$:  unit transmission cost between any pair of different sites $j_1$ and $j_2$.



3.4  Goal  function 

The goal function takes the form:

$$\min z = \sum_{i,p} C_{ip} X_{ip} + \sum_{i,j} D_{ij} Y_{ij} - \sum_{h,p} A_{hp} W_{hp} - \sum_{h,j} B_h V_{hj}$$

where:

$C_{ip}$  :  cost of partitioning object $i$ according to the partition $p$

$D_{ij}$  :  cost of allocating the whole object $i$ on node $j$

$C_{ip}$ and $D_{ij}$ take into account both transmission and access costs

$A_{hp}$  :  cost of transmissions which can be saved because of the use of
the same partitioning criterion $p$ on the owner and the member of the
link $h$

$B_h$  :  cost of transmissions which are saved because both the owner
and the member of the link $h$ are stored on the same site $j$ as a whole.
Note that this parameter does not depend on a particular site $j$, as
uniform transmission costs are assumed.
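To make the roles of the four coefficients concrete, the sketch below evaluates the goal function for two candidate assignments of a two-object schema. It assumes the goal function is the sum of the $C$ and $D$ cost terms minus the $A$ and $B$ savings terms, as described above; the function name `goal_value` and all numeric coefficients are invented for the example.

```python
def goal_value(C, D, A, B, X, Y, W, V):
    """Evaluate z = sum C*X + sum D*Y - sum A*W - sum B*V for a 0/1 assignment."""
    cost = sum(C[key] * x for key, x in X.items())
    cost += sum(D[key] * y for key, y in Y.items())
    savings = sum(A[key] * w for key, w in W.items())
    # B depends only on the link h, not on the site j (uniform transmission costs)
    savings += sum(B[h] * v for (h, _site), v in V.items())
    return cost - savings


# Hypothetical coefficients for objects Dept and Emp joined by link h1.
C = {("Dept", "p1"): 50, ("Emp", "p1"): 80}
D = {("Dept", "s1"): 40, ("Emp", "s1"): 90}
A = {("h1", "p1"): 30}
B = {"h1": 20}

both_partitioned = goal_value(
    C, D, A, B,
    X={("Dept", "p1"): 1, ("Emp", "p1"): 1},
    Y={("Dept", "s1"): 0, ("Emp", "s1"): 0},
    W={("h1", "p1"): 1},
    V={("h1", "s1"): 0},
)
both_whole = goal_value(
    C, D, A, B,
    X={("Dept", "p1"): 0, ("Emp", "p1"): 0},
    Y={("Dept", "s1"): 1, ("Emp", "s1"): 1},
    W={("h1", "p1"): 0},
    V={("h1", "s1"): 1},
)
print(both_partitioned, both_whole)
```

With these invented numbers the partitioned allocation is cheaper because its larger join saving outweighs its higher access cost.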


3.5  Coefficient  Evaluation 

The four coefficients $C_{ip}$, $D_{ij}$, $A_{hp}$, and $B_h$ specify the cost of the system
operation and have to be determined before the optimization of the model can take
place.

3.5.1  Coefficient $C_{ip}$ for partitioned objects

The coefficient $C_{ip}$ gives the cost of partitioning an object $i$ according to the
partitioning $p$. This cost is in turn given by the sum of an access component $CA_{ip}$
and a transmission component $CT_{ip}$:

$$C_{ip} = CA_{ip} + CT_{ip}$$

Both  parts  require  some  elaboration. 

A)  Access Component  The access cost component due to partitioning is
composed of the sum of the costs due to all defined and relevant transactions in
this partitioning:

$$CA_{ip} = \sum_k SCA_{ipk}$$

where $SCA_{ipk}$, the access cost for a single transaction, is defined as:

$$SCA_{ipk} = \sum_{j \ne o(k)} NRU^k_{ipj}\, CRU_j + \sum_{j = o(k)} NLU^k_{ipj}\, CLU_j + \sum_{j \ne o(k)} NRR^k_{ipj}\, CRR_j + \sum_{j = o(k)} NLR^k_{ipj}\, CLR_j$$

The counts $NRU^k_{ipj}$, $NLU^k_{ipj}$, $NRR^k_{ipj}$, and $NLR^k_{ipj}$ are respectively the number
of remote updates, local updates, remote retrievals, and local retrievals for
transaction $k$ accessing the fragment on node $j$ of the object $i$ partitioned according
to partitioning $p$. This cost parameter applies only to those objects which can be
potentially partitioned and is evaluated separately for every candidate partitioning
$p$.

$$NRU^k_{ipj} = AC^k_{ipj}\, u^k_i$$
$$NLU^k_{ipj} = AC^k_{ipj}\, u^k_i$$
$$NRR^k_{ipj} = AC^k_{ipj}\, (1 - u^k_i)$$
$$NLR^k_{ipj} = AC^k_{ipj}\, (1 - u^k_i)$$

Let $AC^k_{ipj}$ be the number of accesses of transaction $k$ to the fragment on node
$j$ of the object $i$ partitioned according to partition $p$. Recall that we defined a
matching parameter $m$ so that $m^k_{ip} = 1$ if transaction $k$ “matches” partitioning
$p$, i.e., it selects objects which are potentially allocated by the partitioning on the
originating site of the transaction. Then $AC^k_{ipj}$ is defined, for the fragment $q$
which satisfies the allocation condition, as follows.

$$AC^k_{ipj} = \begin{cases} t^k_i \, fr(i,p,q) \mid alloc(i,p,q) = j & \text{if } m^k_{ip} = 0 \\ t^k_i & \text{else if } j = o(k) \\ 0 & \text{else} \end{cases}$$

i.e. if transaction $k$ doesn’t match partition $p$, a uniform distribution of accesses
is assumed and $AC^k_{ipj}$ is computed as a fraction of the total accesses proportional
to the fragment size; otherwise, in case of a match, two cases arise:

i:  the transaction is issued on the node $j$ that we are considering, and
in this case all the accesses are made there, or

ii:  the transaction is issued on a different site, and in this case we have
no accesses.
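The case analysis above can be written as a small function. This is only a sketch under assumptions: the name `fragment_accesses`, the representation of a partitioning as a list of (fraction, site) pairs standing for $fr(i,p,q)$ and $alloc(i,p,q)$, and the sample numbers are all hypothetical.

```python
def fragment_accesses(t, matches, origin, site, fragments):
    """Accesses of one transaction to the fragment(s) of an object on `site`.

    t         -- total accesses made by the transaction to the object
    matches   -- True if the transaction matches the partitioning
    origin    -- site where the transaction is issued
    fragments -- list of (fr, alloc) pairs: fraction of the object held by
                 each fragment and the site the fragment is allocated to
    """
    if not matches:
        # no match: accesses are spread in proportion to fragment size
        return sum(t * fr for fr, alloc in fragments if alloc == site)
    # match: all accesses happen at the site of origin, none elsewhere
    return t if site == origin else 0


# Hypothetical partitioning: a quarter of the object on site A, the rest on B.
fragments = [(0.25, "A"), (0.75, "B")]
print(fragment_accesses(100, False, "A", "B", fragments))
print(fragment_accesses(100, True, "A", "A", fragments))
print(fragment_accesses(100, True, "A", "B", fragments))
```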

B)  Transmission Component  The transmission component $CT_{ip}$ of the cost
coefficient $C_{ip}$ is given by the sum

$$CT_{ip} = CT'_{ip} + CT''_{ip}$$

where $CT'_{ip}$ takes into account the transmissions needed for performing the
join operations required by the query or update, and $CT''_{ip}$ takes into account
the transmission of the results from the allocation sites of object $i$ (partitioned
according to $p$) to the site of origin $o(k)$ of the transaction.

The communication load for performing the joins is

$$CT'_{ip} = \sum_{q \,\mid\, alloc(i,p,q) \ne o(k)} \;\; \sum_{h \,\mid\, own(h) = i} TR_{hi}\, TC$$

where $TR_{hi}$ is the communication load per link and object fragment, specifically
the number of bytes per unit time which are transmitted along link $h$ for accessing
object $i$.

The summation over $q$ determines the number of effective fragments of object $i$
partitioned according to $p$, i.e. the cardinality of the set of network
sites where partitioning $p$ places the fragments. Remember that partition predicates
will match instances of object $i$ to a subset of the network sites, and the number of
terms in this summation is the cardinality of such a subset. The condition
$alloc(i,p,q) \ne o(k)$ is used because join information has to be transmitted to all
the fragments of object $i$ located at remote sites. $TC$ is the cost parameter for
data transmission.

The unit communication load $TR_{hi}$ can be computed from the transaction
specifications as follows:

$$TR_{hi} = \sum_k \;\; \sum_{i' \,\mid\, Next^k_{i'} = i \,\wedge\, l^k_{i'} = h} r^k_{i'}\, a^k_{i'}$$

i.e., the instances $r^k_{i'}$ of the object $i'$ which precedes $i$ in the access path of the
transaction $k$, and which therefore have to be sent to object $i$, are weighted by their
size $a^k_{i'}$; here $h$ connects $i'$ to $i$.

Note that in evaluating the transmission volumes the coefficient $r^k_i$ is used,
since it is assumed that restrictions are performed before transmitting objects
in the network; in evaluating access volumes, the total access coefficients $t^k_i$ are
used instead, because instances must actually be accessed before the evaluation of
restriction predicates on them.

Similarly, the cost for result transmissions is obtained by summing the
communication load for the results of each transaction

$$CT''_{ip} = \sum_k SCT''_{ipk}$$

where $SCT''_{ipk}$ is the cost for the transmission of the result from a single
transaction, defined as

$$SCT''_{ipk} = \sum_{q \,\mid\, i \in f(k) \,\wedge\, alloc(i,p,q) \ne o(k)} (1 - u^k_i)\, r^k_i\, fr(i,p,q)\, TC$$

i.e., those fragments of the result of the transaction which are not allocated on the
site of origin of the transaction are transmitted to that site; the
object $i$ must belong to the set of terminal objects for that transaction ($i \in f(k)$),
and the type of access must be retrieval.
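The result-transmission cost for a single transaction can be sketched as follows. The sketch assumes the cost is the number of selected instances, scaled by the fraction held in each remote fragment and by the unit transmission cost, and that it applies only to retrievals of a final object; the name `result_transmission_cost`, the (fraction, site) representation of fragments, and the numbers are hypothetical.

```python
def result_transmission_cost(r, updated, is_final, origin, fragments, TC):
    """Cost of shipping the result of one transaction back to its origin.

    r         -- number of instances of the object selected by the transaction
    updated   -- True if the transaction updates the object (no result shipped)
    is_final  -- True if the object is a final object of the transaction
    fragments -- list of (fr, alloc) pairs for the chosen partitioning
    TC        -- unit transmission cost
    """
    if updated or not is_final:
        return 0.0
    # only fragments stored away from the site of origin cause transmissions
    return sum(r * fr * TC for fr, alloc in fragments if alloc != origin)


fragments = [(0.25, "A"), (0.75, "B")]
print(result_transmission_cost(40, False, True, "A", fragments, TC=2))
```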

3.5.2  Coefficient $D_{ij}$ for whole objects

This coefficient evaluates the cost associated with objects which are wholly assigned
to a particular site. It is given by the sum of an access component $DA_{ij}$ and a
transmission component $DT_{ij}$:

$$D_{ij} = DA_{ij} + DT_{ij}$$

These costs can again be evaluated for each transaction.

A)  Access Component  The access component is the sum of the access costs
of single transactions $SDA_{ijk}$

$$DA_{ij} = \sum_k SDA_{ijk}$$

and the access cost of a single transaction $SDA_{ijk}$ is either the sum of the local
costs, if $o(k) = j$, or the sum of the remote costs:

$$SDA_{ijk} = \delta_{j,o(k)}\, NLU^k_i\, CLU_j + (1 - \delta_{j,o(k)})\, NRU^k_i\, CRU_j + \delta_{j,o(k)}\, NLR^k_i\, CLR_j + (1 - \delta_{j,o(k)})\, NRR^k_i\, CRR_j$$

where $\delta_{j,o(k)}$ is 1 if $j = o(k)$ and 0 otherwise. The factors $NLU^k_i$, $NRU^k_i$,
$NLR^k_i$, $NRR^k_i$ are respectively the number of local updates, remote updates, local
retrievals, and remote retrievals for transaction $k$ accessing object $i$ stored on
any site as a whole.

$$NRU^k_i = AC^k_i\, u^k_i$$
$$NLU^k_i = AC^k_i\, u^k_i$$
$$NRR^k_i = AC^k_i\, (1 - u^k_i)$$
$$NLR^k_i = AC^k_i\, (1 - u^k_i)$$

In this computation $AC^k_i$ is simply the number of accesses made by transaction $k$
to object $i$.

$$AC^k_i = t^k_i$$

B)  Transmission Component  The transmission component $DT_{ij}$ of the cost
coefficient $D_{ij}$ is given, similarly to the component $CT_{ip}$, as the sum

$$DT_{ij} = DT'_{ij} + DT''_{ij}$$

where again $DT'_{ij}$ takes into account transmissions which are needed for performing
the join operations, and $DT''_{ij}$ takes into account the transmission of the result, for
retrieval transactions, from the node $j$ where the object $i$ is stored ($i$ being one of
the terminal objects of the transaction’s access path) to the site of origin $o(k)$.

$$DT'_{ij} = \sum_h TR_{hi}\, TC$$

where $TR_{hi}$ and $TC$ were previously introduced; in this case, the join information
has to be sent to only one site.

$$DT''_{ij} = \sum_k SDT''_{ijk}$$

where the cost of the transmission of the result from a single transaction, $SDT''_{ijk}$,
is defined as

$$SDT''_{ijk} = \begin{cases} (1 - u^k_i)\, r^k_i\, TC & \text{if } i \in f(k) \,\wedge\, o(k) \ne j \\ 0 & \text{otherwise} \end{cases}$$

i.e. the results of the transaction, which retrieves some information from the object
$i$ allocated on node $j$, have to be transmitted to the origin $o(k)$ of the transaction,
if $o(k) \ne j$.
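For a whole object the access cost of one transaction depends only on whether the storing site is the site of origin and on whether the access is an update. A minimal sketch follows, assuming the four unit costs are given per site as dictionaries; the name `whole_object_access_cost` and the cost values are invented for the example.

```python
def whole_object_access_cost(t, updated, site, origin, CLU, CRU, CLR, CRR):
    """Access cost of one transaction on an object stored whole on `site`.

    t is the total number of accesses; CLU, CRU, CLR, CRR map a site to its
    unit cost for local update, remote update, local retrieval and remote
    retrieval accesses respectively.
    """
    local = 1 if site == origin else 0   # plays the role of the delta factor
    u = 1 if updated else 0
    return (local * t * u * CLU[site]
            + (1 - local) * t * u * CRU[site]
            + local * t * (1 - u) * CLR[site]
            + (1 - local) * t * (1 - u) * CRR[site])


# Hypothetical unit costs, identical on both sites.
CLU = {"A": 2, "B": 2}
CRU = {"A": 5, "B": 5}
CLR = {"A": 1, "B": 1}
CRR = {"A": 3, "B": 3}
print(whole_object_access_cost(10, False, "A", "A", CLU, CRU, CLR, CRR))
print(whole_object_access_cost(10, True, "B", "A", CLU, CRU, CLR, CRR))
```

Exactly one of the four terms is non-zero for any combination of local/remote and retrieval/update.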

3.5.3  Coefficient $A_{hp}$

This coefficient evaluates the savings in performing the joins using link $h$ when
both the owner and the member objects of $h$ are partitioned according to the same
partition $p$; it is:

$$A_{hp} = \left( TR_{h\,own(h)} + TR_{h\,mem(h)} \right) n_p$$

where the coefficients $TR_{hi}$ have already been defined.

3.5.4  Coefficient $B_h$

This coefficient evaluates the savings in performing the joins using link $h$ when both
the owner and the member objects of $h$ are stored as a whole on the same site $j$; it
is:

$$B_h = TR_{h\,own(h)} + TR_{h\,mem(h)}$$

3.6  Additional  Constraints  for  Derived  Partitions 

In some cases, it may be necessary to model an additional constraint. Consider
a derived partition from object $i'$ to object $i''$ using the link $h$ connecting $i'$ and
$i''$. It could be required that a partition be induced on object $i''$ in the solution
if and only if the same partition is also applied to the object $i'$ in the solution.
As an example, consider the case of a candidate partitioning of the Department
object by location (for instance, North, South, East, West) and of a candidate
derived partitioning of Employee objects, using the link which gives the department
in which each employee works. Assume then that the candidate partitioning is
selected for Employee, but not for Department. The problems which arise in this
case are due to the fact that the employee information by itself is not sufficient
to determine the partition and the site where the record belongs. Therefore the
transaction which generates a new employee record should first join the employee
record with the department record in order to derive the corresponding location,
and hence determine the fragment where the record should be stored. This case is
different from, for instance, the use of a partitioning criterion on the department
number, which is the key field of the department object and also appears in the
employee information (hence, the fragmentation criterion can be deduced without
joining the corresponding relations). In conclusion, it is left to the designer to
evaluate the possibility of constraining the derivation of predicates at lower levels
of the derivation hierarchy to be the same as the predicate at higher levels. The
constraint that models this fact is simply:

(4)  Consistency  constraint  for  derived  partitions 

$$X_{i''p} \le X_{i'p} \quad \text{for constrained pairs } \langle i', i'' \rangle \text{ in the derivation}$$

In fact, because of this constraint, the optimization model can be simplified.
The variable $W_{hp}$ becomes unnecessary as a result of the above constraint, because
if $X_{i''p}$ is set to 1 in the solution, then certainly the link is used for deriving the
partition (compare with constraints (2) in Sec. 3.2). Hence a modified model can be
used in which:

a:  no $W_{hp}$ variable is introduced for those pairs of objects which are constrained
in the derivation,

b:  the coefficient $C_{i''p}$ is computed in the modified model as the difference of
$C_{i''p}$ and $A_{hp}$ in the original model, since the savings will always occur if the
partitioning $p$ is used for object $i''$,

c:  the  constraint  (4)  is  introduced. 

3.6.1  Modelling  dependencies  between  objects 

Similar constraints can be used for modelling other kinds of dependencies between
objects. Assume that object $o''$ is “semantically” dependent on object $o'$ (for
instance, $o''$ is a weak entity in the sense of [Chen76], or $o''$ is “externally”
identified from $o'$ [Nava80], or the link between $o'$ and $o''$ is an ownership connection
in the structural model [WiEl80]). This dependency might force an access to object
$o'$ whenever $o''$ is accessed, and in this case the designer could decide to force object
$o''$ to have the same allocation as object $o'$. As above, we need to introduce the
constraints (4’):

(4’)  Dependency Constraint

$$X_{i''p} \le X_{i'p} \;\wedge\; Y_{i''j} \le Y_{i'j} \quad \text{for dependent pairs } \langle i', i'' \rangle$$

and the possibility of simplifying the model exists along similar lines as in the above
discussion. Here both the $V_{hj}$ and $W_{hp}$ variables would be eliminated.

4  COMPUTATIONAL  COMPLEXITY  AND  DECOMPOSITION

The  computational  complexity  of  a  linear  zero-one  program  depends  roughly  on  the 
number  of  variables  which  are  involved. 

Let:

$N$  :  the number of sites

$NP_i$  :  the number of candidate partitions for object $i$

$R$  :  the number of objects

$L$  :  the number of links

$ND_h$  :  the number of partitions which are derived using the link $h$,

then the number of variables in the model is

$$NVAR = N(R + L) + \sum_i NP_i + \sum_h ND_h$$

(there are: $NP_i$ variables $X_{ip}$ and $N$ variables $Y_{ij}$ for each object, $ND_h$ variables
$W_{hp}$ and $N$ variables $V_{hj}$ for each link)
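The variable count is easy to tabulate for a given schema. The sketch below assumes the count is $N$ variables per object and per link plus one variable per candidate partition and per derived partition; the function name `nvar` and the example sizes are hypothetical.

```python
def nvar(N, NP, ND):
    """Number of 0/1 variables in the non-redundant allocation model.

    N  -- number of sites
    NP -- list with the number of candidate partitions of each object
    ND -- list with the number of derived partitions of each link
    """
    R, L = len(NP), len(ND)
    return N * (R + L) + sum(NP) + sum(ND)


# Hypothetical schema: 3 sites, 3 objects, 2 links.
print(nvar(3, NP=[2, 1, 0], ND=[1, 0]))
```

Because the count grows with the product of sites and schema elements, even moderate schemas quickly produce hundreds of variables.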

This  number  can  easily  become  too  large  for  integer  programming  solution 
methods;  hence  some  techniques  are  needed  for  decomposing  the  original  design 
problem  into  subproblems  which  are  computationally  feasible.  The  decomposition 
aims  at  determining  subsets  of  the  enterprise  schema  which  can  be  independently 
optimized; it is desirable to “cut” the model into subsets by snapping the links
along  which  the  least  transmission  volumes  occur,  as  the  allocation  of  these  links 
will  not  be  optimized. 

A  model  for  determining  such  a  decomposition  of  the  problem  is  presented 
in  the  following;  the  aim  of  the  model  is  to  decompose  the  original  problem  into 
subproblems whose dimension is big enough to represent meaningful problems, yet
is  small  enough  to  be  computationally  feasible.  This  aspect  is  captured  by  one 
of  the  model  constraints,  which  limits  the  number  of  variables  belonging  to  the 
subproblem  between  a  lower  and  an  upper  bound. 


4.1  Decomposition  Model 

The  model  determines  a  subproblem  S  consisting  of  a  set  of  objects  and  links 
having  an  associated  number  of  decision  variables  for  the  non-redundant  allocation 
model  which  is  limited  between  a  lower  and  an  upper  bound.  An  optimal  set  S  is 
determined  by  minimizing  the  volume  of  transmissions  which  use  the  links  between 
objects  belonging  to  the  set  and  objects  outside  the  set. 

Variables  One decision variable is introduced for each object $i$, and two
decision variables are introduced for each link $h$.

$X_i$ = 1 if the object $i$ belongs to the subproblem $S$
= 0 otherwise.

$Y_h$ = 1 if the link $h$ connects an object of $S$ and an object outside $S$
= 0 otherwise.

$Z_h$ = 1 if the link $h$ connects two objects of $S$
= 0 otherwise.

Goal function  The goal function for the decomposition has the form:

$$\min z = \sum_h W_h Y_h$$

where $W_h$ represents the transmission requirements along the link $h$, and it is:

$$W_h = B_h = TR_{h\,own(h)} + TR_{h\,mem(h)}$$

The coefficient $TR_{hi}$ is introduced in Sect. 3.5.1.

Constraints  Size constraints are introduced to ensure that the subproblems
created from the decomposition phase will be neither too large nor too small. There
are also two new consistency constraints.

1)  Constraint on the dimension of the subproblem. Let $LB$ and $UB$ represent
the lower and upper bound respectively on the number of variables that are included
in the subproblem; then:

$$LB \le \sum_i (N + NP_i)\, X_i + \sum_h (N + ND_h)\, Z_h \le UB$$

2)  Consistency constraints for the variables $Y_h$. Each $Y_h$ must be forced to 1
when the values assumed by the $X$ variables of the owner and member objects are
different. Otherwise the $Y_h$ value is free, and because of the positive coefficient in
the goal function, $Y_h$ will naturally be 0.

$$Y_h \ge X_{own(h)} - X_{mem(h)} \;\wedge\; Y_h \ge X_{mem(h)} - X_{own(h)}, \quad 1 \le h \le L$$

3)  Consistency constraints for the variables $Z_h$. Each $Z_h$ must be forced to 1
when both owner and member are 1, but must be forced to 0 when any of them is
0 (hence, $Z_h$ is equal to the product $X_{own(h)} X_{mem(h)}$). This is modelled in a linear
program by introducing the constraints:

$$Z_h \ge X_{own(h)} + X_{mem(h)} - 1 \;\wedge\; Z_h \le X_{own(h)} \;\wedge\; Z_h \le X_{mem(h)}, \quad 1 \le h \le L$$
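The effect of the two groups of consistency constraints can be verified by brute force over the four possible owner/member memberships. The sketch assumes the usual linearization of a product of 0/1 variables ($Z \le X_{own}$, $Z \le X_{mem}$, $Z \ge X_{own} + X_{mem} - 1$) together with the two inequalities on $Y$ stated in the text; the function names are hypothetical.

```python
from itertools import product


def feasible(x_own, x_mem, y, z):
    """True if (y, z) satisfy the linear consistency constraints."""
    return (y >= x_own - x_mem and y >= x_mem - x_own
            and z >= x_own + x_mem - 1 and z <= x_own and z <= x_mem)


def forced_values(x_own, x_mem):
    """Smallest feasible Y (the one a minimization selects) and all feasible Z."""
    pairs = [(y, z) for y, z in product((0, 1), repeat=2)
             if feasible(x_own, x_mem, y, z)]
    return min(y for y, _ in pairs), {z for _, z in pairs}


for x_own, x_mem in product((0, 1), repeat=2):
    y, zs = forced_values(x_own, x_mem)
    print(x_own, x_mem, y, zs)
```

In every case the only feasible $Z$ is the product of the two $X$ values, and the smallest feasible $Y$ is 1 exactly when the link crosses the boundary of $S$.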
5  DESIGN  OF  A  PARTIALLY  REPLICATED  DISTRIBUTION  OF
THE  DATABASE

The  introduction  of  redundancy  in  a  distributed  database  can  lead  to  important 
advantages  both  from  the  viewpoint  of  performance  and  reliability  of  the  system. 
The  introduction  of  redundancy  has  to  be  handled  with  great  care,  since  it  also  leads 
to  an  increase  in  the  complexity  of  the  distributed  database  management  software. 

Redundancy exists at many levels in database systems. System level redundancy
is provided by data encoding and logging mechanisms, with the sole objective of
gaining reliability. Redundancy in databases is commonly increased by the use of
indexes and auxiliary access paths. In distributed systems especially, we find that
sets of data elements are kept redundantly at multiple sites.

The  potential  gain  in  performance  from  such  a  replication  is  due  to  the  fact  that 
any  of  the  copies  of  each  replicated  database  object  can  be  used  by  a  transaction 
for  a  retrieval  access,  provided  that  all  the  copies  are  consistent;  hence  several 
execution  strategies  can  be  used  for  accessing  objects,  decreasing  overall  execution 
costs.  The  improved  reliability  is  obviously  related  to  the  availability  of  several 
copies,  geographically  dispersed,  of  the  same  information.  The  increase  in  the 
complexity  of  database  management  is  mostly  due  to  the  need  of  maintaining  the 
consistency  of  the  replicated  copies  of  the  same  data  objects;  updates  have  therefore 
to  be  propagated  to  all  of  them. 

In the following, we will describe a heuristic technique for progressively
introducing redundancy by replication, using the optimal non-redundant solution as a
basis. Assumptions will be made about how transactions are handled in the
replicated environment, aiming to give an execution model of transactions which is
typical of distributed database systems which employ replication. It will be shown that
the transaction execution model is inherently combinatorial because of replication,
and this motivates the use of a heuristic allocation algorithm.


5.1  Assumptions  about  the  Distributed  Database  Environment 

In order to analyse replication within the distributed database environment we have
to modify and extend the assumptions made in the previous analysis:

1:  The  non-redundancy  constraint  is  relaxed;  therefore,  it  is  possible  to  have 
several  different  allocations  of  the  same  data  object. 

2:  The updates are immediately propagated to all the copies of each data
object; therefore updates to objects are directed to all the sites where the objects
are replicated.

3:  The optimizer which determines the execution strategy of transactions has
the following features:

a:  It has a global knowledge of the database; global directories are therefore
available at each site where transactions are optimized.

b:  It selects the best alternative among the logically equivalent execution
strategies which are possible for retrieval accesses; this choice reflects
the same criteria which are used in the optimization model, namely, the
minimization of access and transmission costs.




4:  The replicated copies are used to enhance system reliability as well, and
lessen  the  requirements  for  stable  storage  at  each  node  [MiWi81].  The  increase 
in  reliability  which  depends  on  the  presence  of  multiple  copies  can  be  taken  into 
account  by  associating  to  each  object  a  set  of  negative  cost  parameters,  each  of 
which  estimates  the  overall  benefit  which  is  a  function  of  the  number  of  copies  of 
the  object. 


5.2  Example  of  Transaction  Execution  over  a  Redundantly  Distributed 
Database 

Consider a simple transaction $T_k$ which accesses two objects $O_1$ and $O_2$ which have
degrees of redundancy $dr_1$ and $dr_2$ respectively. Consider the following 4 possible
cases:

case a:  Both $O_1$ and $O_2$ are retrieved ($u^k_1 = u^k_2 = 0$); then there are
$dr_1 \times dr_2$ possible execution strategies for the transaction, each using
one particular pair of copies of $O_1$ and $O_2$. The cost associated with
transaction execution is the minimum cost among these alternatives.

case b:  $O_1$ is retrieved and $O_2$ is updated ($u^k_1 = 0$, $u^k_2 = 1$); then all
the copies of $O_2$ are accessed, while there are $dr_1$ alternative execution
strategies for accessing one of the copies of $O_1$. The information which
is used for joining $O_1$ and $O_2$ is sent from the selected copy of $O_1$ to
all copies of $O_2$. The cost associated with transaction execution is the
minimum cost among the $dr_1$ alternatives.

case c:  $O_1$ is updated and $O_2$ is retrieved ($u^k_1 = 1$, $u^k_2 = 0$); then all
the copies of $O_1$ must be accessed, but only one copy of $O_2$ is sufficient.
The information which is used for joining $O_1$ and $O_2$ is sent from the
most convenient copy of $O_1$ to the selected copy of $O_2$. Again, the
cost associated with transaction execution is the minimum among the $dr_2$
alternatives.

case d:  Both $O_1$ and $O_2$ are updated ($u^k_1 = u^k_2 = 1$); then there is only
one execution strategy for $T_k$, consisting of accessing all the copies of
both $O_1$ and $O_2$. For each copy of $O_2$, the information which is used
for joining $O_1$ and $O_2$ is sent from the most convenient copy of $O_1$.
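The number of alternative strategies in the four cases follows one rule: each retrieved object contributes a factor equal to its degree of redundancy, and each updated object contributes a factor of one. A minimal sketch, with the hypothetical function name `strategy_count` and invented degrees of redundancy:

```python
from math import prod


def strategy_count(degrees, updated):
    """Number of alternative execution strategies for one transaction.

    degrees -- degree of redundancy of each accessed object
    updated -- matching flags, True where the object is updated; an updated
               object offers no choice because all its copies are accessed
    """
    return prod(dr for dr, u in zip(degrees, updated) if not u)


degrees = (3, 2)  # hypothetical: three copies of O1, two copies of O2
for case, flags in [("a", (False, False)), ("b", (False, True)),
                    ("c", (True, False)), ("d", (True, True))]:
    print(case, strategy_count(degrees, flags))
```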

In the following, some definitions are given which are useful for the replicated
optimization  model. 

Let S be a non-redundant solution, O be the set of objects of the database, and I be
a subset of O. The set SOL(S, I) contains those solutions S' generated by taking all
possible ways in which objects in I can be non-redundantly allocated in combination
with replicated allocations for the objects in O − I. Therefore, indicating with a prime
the allocation variables of S' and without a prime the allocation variables of S, the
following definition can be given:

$$
\begin{aligned}
SOL(S, I) = \{\, S' \mid\ & \forall i \in I,\ \big( (\exists p \mid X_{ip} = X'_{ip} = 1 \wedge (\forall p' \neq p,\ X'_{ip'} = 0) \wedge (\forall j,\ Y'_{ij} = 0)) \\
& \qquad \vee\ (\exists j \mid Y_{ij} = Y'_{ij} = 1 \wedge (\forall j' \neq j,\ Y'_{ij'} = 0) \wedge (\forall p,\ X'_{ip} = 0)) \big), \\
& \forall i \in (O - I),\ (X'_{ip} = X_{ip} \wedge Y'_{ij} = Y_{ij}) \,\}
\end{aligned}
$$

The cardinality of SOL(S, I) is a function of the degree of redundancy dri of
the objects in I; it is

$$ |SOL(S, I)| = \prod_{i \in I} dr_i $$
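The construction of SOL(S, I) can be illustrated with a minimal Python sketch. The data layout is an assumption made only for this illustration: a solution is a dictionary mapping each object name to the set of allocation variables that are 1, where ("X", p) stands for partitioning p and ("Y", j) for whole-object storage at site j. The function name `sol` and the sample allocation are hypothetical.

```python
from itertools import product

def sol(S, I):
    """Enumerate SOL(S, I): keep exactly one copy of every object in I,
    leave the allocation of all other objects as it is in S."""
    chosen = sorted(I)
    # For each object in I, the copies among which one must be picked.
    options = [sorted(S[obj]) for obj in chosen]
    results = []
    for pick in product(*options):
        s_prime = {obj: set(alloc) for obj, alloc in S.items()}  # copy of S
        for obj, var in zip(chosen, pick):
            s_prime[obj] = {var}  # single (non-redundant) allocation
        results.append(s_prime)
    return results

# Hypothetical allocation: two objects, each stored as a whole at two sites.
S = {"O1": {("Y", 1), ("Y", 3)}, "O2": {("Y", 1), ("Y", 2)}}
print(len(sol(S, {"O1", "O2"})))  # product of the degrees of redundancy
```

When I is empty the sketch returns S itself as the only element, which corresponds to the single execution strategy of case d.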

Finally, we can say that two solutions S' and S'' differ in one variable V, or
S' − S'' = V, when X'ip = X''ip and Y'ij = Y''ij for all variables other than V, and
V' = 1, V'' = 0.

Given a solution S, it is possible to evaluate the transaction execution cost
C(Tk, S) as the minimum of the set of alternative transaction execution costs
TEC(Tk, S'), where S' is one of the solutions in SOL(S, Ik) and Ik is the set of the
objects retrieved by the transaction. We have:

$$ C(T_k, S) = \min_{S' \in SOL(S, I_k)} TEC(T_k, S') $$


Recalling the cases of Sect. 5.2, the set of retrieved objects is

Ik = {O1, O2} in case a,

Ik = {O1} in case b,

Ik = {O2} in case c,

Ik = ∅ in case d;

in general, Ik contains the objects used by Tk that are accessed for retrieval only.

Figure 8 shows, for the four cases of Sect. 5.2, the accesses of transactions and the
required transmissions in terms of the set SOL(S, I), given a solution S.
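As a small illustration of the two definitions above, the following Python sketch derives Ik from per-object update flags and takes the minimum of TEC over a set of candidate solutions. The function names, the flag dictionary, and the sample cost values are assumptions made for this example; the evaluation of TEC itself is left to a caller-supplied function.

```python
def retrieved_set(update_flags):
    """I_k: the objects used by the transaction whose update flag u_i is 0."""
    return {obj for obj, u in update_flags.items() if u == 0}

def transaction_cost(candidates, tec):
    """C(T_k, S): minimum of TEC(T_k, S') over the candidates in SOL(S, I_k)."""
    return min(tec(s_prime) for s_prime in candidates)

# The four cases of Sect. 5.2 for a transaction that uses O1 and O2.
case_a = retrieved_set({"O1": 0, "O2": 0})
case_b = retrieved_set({"O1": 0, "O2": 1})
case_c = retrieved_set({"O1": 1, "O2": 0})
case_d = retrieved_set({"O1": 1, "O2": 1})

# Hypothetical TEC values for three alternative solutions.
sample_tec = {"s1": 12.0, "s2": 7.5, "s3": 9.0}
best = transaction_cost(["s1", "s2", "s3"], sample_tec.get)
```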


5.3 Description of the Object Allocation for the Redundant Database
Distribution Algorithm

The description of a redundant database allocation uses the same variables Xip and
Yij that were introduced in the non-redundant model, releasing the non-redundancy
constraint. Therefore it is now possible to allocate an object according to several
alternative partitionings, to store it as a whole at several sites, or to combine
partitionings and full allocations. We can define a redundant solution S as the
assignment of 0/1 values to the decision variables Xip and Yij, subject to the
constraint that each object should be allocated at least once in the distributed
database; the non-redundancy constraint is therefore modified as follows:

$$ \sum_p X_{ip} + \sum_j Y_{ij} \ge 1, \qquad \forall i $$


5.4 Evaluation of Transaction Execution Cost

In  this  section  the  transaction  execution  cost  TEC(Tk,  S')  for  a  given  solution  S'  of 
SOL(S,  Ik)  is  derived  from  the  logical  description  of  transaction  accesses,  introduced 
in Section 2.3. The same cost parameters as in the non-redundant model are used
for access and transmission costs (see Section 3.3).

The  transaction  execution  cost  TEC(Tk,S')  comprises  2  components:  access 
costs  AC(Tk,S')  and  transmission  costs  TC(Tk,S'). 

Access  Cost  Let  U(k)  be  the  set  of  objects  used  by  transaction  k.  The  access 
cost  AC(Tk,S')  considers  essentially  the  same  components  as  in  the  non-redundant 
model,  but  now  aggregation  is  made  on  the  objects  for  a  fixed  transaction  instead 
of  aggregating  on  transactions  while  keeping  the  object  fixed.  It  is: 

$$ AC(T_k, S') = \sum_{i \in U(k)} (SCA_{ipk} X_{ip} + SDA_{ijk} Y_{ij}) $$

where SCAipk and SDAijk were defined in Secs. 3.5.1 and 3.5.2. However, Xip and
Yij are now fixed and appear in the cost evaluation. Also, notice that there can be
more than one value Xip or Yij set to 1 for the objects which are updated.
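A minimal Python sketch of the access cost evaluation follows. It assumes that the sum runs over every allocation variable set to 1 for the objects used by the transaction, and that the coefficients for the transaction at hand are supplied as dictionaries keyed by (object, partitioning) and (object, site); the function name and the sample numbers are hypothetical.

```python
def access_cost(used_objects, sca, sda, x, y):
    """AC(T_k, S') for one transaction k.

    used_objects: U(k), the objects used by the transaction
    sca[(i, p)], sda[(i, j)]: access cost coefficients for this transaction
    x, y: sets of (i, p) and (i, j) pairs whose variable is 1 in S'
    """
    total = 0.0
    for (i, p) in x:
        if i in used_objects:
            total += sca[(i, p)]
    for (i, j) in y:
        if i in used_objects:
            total += sda[(i, j)]
    return total

# Object 1 is partitioned according to partitioning 4; object 2 is updated
# and stored as a whole at sites 1 and 3; object 3 is not used.
cost = access_cost(
    used_objects={1, 2},
    sca={(1, 4): 10.0},
    sda={(2, 1): 3.0, (2, 3): 5.0, (3, 1): 99.0},
    x={(1, 4)},
    y={(2, 1), (2, 3), (3, 1)},
)
```

Both copies of the updated object contribute to the total, while the unused object does not.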

Transmission Cost The transmission cost TC(Tk, S') is given, as before, by
the sum of two components. The first one, TC'(Tk, S'), takes into account those
transmissions which are required for performing the joins. For every copy of object
i which is accessed via a join with the object i' (i.e. for every pair <i, i'> such
that i' precedes i in the access path of the transaction), a transmission is required,
unless i' has the same allocation as i. Therefore,

a: the transmission of the join information from object i' to all the fragments
of object i is required when the solution S' has the variable Xip set to 1 and Xi'p
set to 0;

b: likewise, the transmission of the join information from object i' to object
i is required when the solution S' has the variable Yij set to 1 and Yi'j set to 0.

We have:

$$
TC'(T_k, S') = \sum_{i \in U(k),\ (i') \to i} (1 - X_{i'p})\, X_{ip}\, N_p\, r_{i'k}\, TC
\;+\; \sum_{i \in U(k),\ (i') \to i} (1 - Y_{i'j})\, Y_{ij}\, r_{i'k}\, TC
$$

This formulation takes care of the minimisation of the transmission cost when i' and
i are allocated according to the same partitioning, or on the same site; otherwise
transmission of join information to each copy which is retrieved or updated is
provided. The minimisation of transaction execution costs which accrue to the
choice of one particular copy of each retrieved object is part of the minimisation of
costs associated with alternative solutions in SOL(S, Ik).

Note the similarity between TC'(Tk, S') and the coefficients CT'ip and DT'ij of
the non-redundant model; here again Xip and Yij are fixed, and hence the use of
their product in the formulation is possible.
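Conditions a and b can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's exact coefficient: the amount of join information per pair is passed in as a hypothetical `join_bytes` table, `n_fragments` gives the number of fragments of each partitioning, and `tc` is the unit transmission cost.

```python
def join_transmission_cost(pairs, x, y, n_fragments, join_bytes, tc):
    """Join component of the transmission cost for one transaction.

    pairs: list of (i_prev, i), where i_prev precedes i in the access path
    x, y: sets of (object, partitioning) / (object, site) variables set to 1
    """
    cost = 0.0
    for i_prev, i in pairs:
        # Condition a: i is partitioned by p and i_prev is not.
        for (obj, p) in x:
            if obj == i and (i_prev, p) not in x:
                cost += n_fragments[p] * join_bytes[(i_prev, i)] * tc
        # Condition b: a copy of i is at site j and i_prev is not stored there.
        for (obj, j) in y:
            if obj == i and (i_prev, j) not in y:
                cost += join_bytes[(i_prev, i)] * tc
    return cost

pairs = [(2, 1)]  # object 2 precedes object 1 in the access path
same_partitioning = join_transmission_cost(
    pairs, {(1, 2), (2, 2)}, set(), {2: 3}, {(2, 1): 20}, 0.5)
partitioned_target = join_transmission_cost(
    pairs, {(1, 4)}, {(2, 1)}, {4: 3}, {(2, 1): 20}, 0.5)
two_copies = join_transmission_cost(
    pairs, set(), {(1, 1), (1, 3), (2, 1)}, {}, {(2, 1): 20}, 0.5)
```

In the third call only the copy of object 1 at site 3 needs a transmission, because the copy at site 1 is co-located with object 2.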

The second component of the transmission cost, TC''(Tk, S'), takes into
account those transmissions which are required for collecting the result of the
transaction on the origin site of the transaction. This cost is evaluated exactly in
the same way as in the non-redundant model, as it involves transmissions to the origin
site from those sites which store terminal objects of the transaction; note that, as
terminal objects are accessed for retrieval, they are not redundantly allocated. We
have:

$$ TC''(T_k, S') = \sum_{i \in U(k)} (SCT''_{ipk} X_{ip} + SDT''_{ijk} Y_{ij})\, TC $$

where SCT''ipk and SDT''ijk were introduced in Secs. 3.5.1 and 3.5.2.


5.5  A  Greedy  Heuristic  Algorithm  for  the  Progressive  Introduction  of 
Redundancy 

A  greedy  heuristic  algorithm  for  the  progressive  introduction  of  redundancy,  using 
the  optimal  non-redundant  solution  as  a  basis,  is  shown  in  Fig.  9.  The  algorithm 
has  the  following  features: 

1 The algorithm is iterative; at each iteration, the solution S determined at the
previous iteration is taken as a basis, and all variables V which have value 0 in that
solution are tentatively set to 1, generating a set of alternative solutions S' such
that S' − S = V. Global costs are then evaluated for all alternatives, and the one
with minimal cost is selected. Therefore, at each step the “degree of redundancy” of
the solution increases. The optimal solution from the non-redundant optimization
model is the basis for the first iteration.

2  The  algorithm  can  be  classified  as  a  greedy  heuristic,  because  at  each  step 
the  variable  V  is  selected  which  decreases  the  overall  costs  the  most. 

3  The  algorithm  is  convergent  toward  a  relative  minimum,  as  the  overall 
cost  monotonically  decreases  with  progressively  determined  solutions.  In  fact,  the 
algorithm  terminates  when  it  is  not  possible  to  decrease  the  overall  cost  any  further. 

4  The  reliability  benefit  accrued  by  having  multiple  copies  is  not  attributed 
to  any  particular  transaction,  but  rather  to  a  solution. 

The  algorithm  compares  alternative  solutions  S  by  associating  to  them  a  global 
cost  C(S)  which  is  based  on  the  cost  of  transaction  execution  and  the  benefits 
accruing  from  the  increasing  reliability  due  to  the  introduction  of  redundancy. 

The object i whose allocation variable value is changed from 0 to 1 in S' divides
the transactions into two sets. The first one consists of those transactions which use
Oi (i ∈ U(k)), whose execution cost has to be evaluated. The second one consists
of those transactions which do not use Oi (i ∉ U(k)), whose execution cost does
not change from the previous iteration. Clearly, by storing individual transaction
execution costs corresponding to the current solution at each iteration, these costs
need not be evaluated for transactions of the second set. The transaction execution
component of the global cost C' is evaluated by summing the contributions from
the transactions of both sets (see Fig. 9).

The reliability benefit can be modeled as a function of dri, the number of
copies of each object i in the considered solution. Realistically, the benefit increases
with dri in a non-linear way; in fact, while the introduction of the first additional
copy is highly beneficial, the interest in having the (dri + 1)-th copy of the same
information decreases with dri. A possible function to model this property is:

$$ f(dr_i) = (1 - 2^{1 - dr_i})\, B_i $$

where Bi is the benefit of having the object i infinitely redundant; note that f(1) =
0, f(2) = 1/2 Bi, f(3) = 3/4 Bi, and so on. The benefits due to replication are
taken into account by summing the values returned by the above function for all
objects in S' (see Fig. 9).
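The values quoted above (no benefit for a single copy, half of the limiting benefit for two copies, three quarters for three) are consistent with a benefit of the form (1 − 2^(1 − dr)) B. Assuming that form, a short Python sketch of the benefit term is given below; the function names and the sample benefit of 8 are hypothetical.

```python
def reliability_benefit(dr, b):
    """Benefit of holding dr copies of an object whose limiting benefit is b."""
    return (1.0 - 2.0 ** (1 - dr)) * b

def total_benefit(degrees, benefits):
    """Sum of the per-object benefits for a whole solution."""
    return sum(reliability_benefit(degrees[i], benefits[i]) for i in degrees)

one_copy = reliability_benefit(1, 8.0)
two_copies = reliability_benefit(2, 8.0)
three_copies = reliability_benefit(3, 8.0)
```

Each additional copy contributes half as much as the previous one, which is the diminishing return described in the text.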

At each iteration, Snew corresponds to the current “best” candidate among the
alternative S' solutions generated from S; C, C' and Cnew denote the corresponding
global costs. The algorithm terminates when none of the candidate solutions S'
yields a global cost which is less than the current global cost, and therefore Cnew <
C is not satisfied.
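The iteration described in this section can be sketched in Python as follows. The sketch treats the global cost as a caller-supplied function of the set of variables that are 1, so it makes no assumption about how execution costs and reliability benefits are combined; the names `greedy_redundancy` and `toy_cost`, and the numbers in the toy cost, are hypothetical.

```python
def greedy_redundancy(start, candidates, global_cost):
    """Add one allocation variable per iteration, always the one that lowers
    the global cost the most; stop when no single addition lowers it."""
    current = frozenset(start)
    current_cost = global_cost(current)
    while True:
        best, best_cost = None, current_cost
        for var in sorted(set(candidates) - current):
            trial = current | {var}
            cost = global_cost(trial)
            if cost < best_cost:
                best, best_cost = trial, cost
        if best is None:  # no candidate improves on the current solution
            return current, current_cost
        current, current_cost = best, best_cost

def toy_cost(solution):
    """Hypothetical global cost with an interaction between 'a' and 'b'."""
    cost = 100.0
    if "a" in solution:
        cost -= 30.0
    if "b" in solution:
        cost -= 10.0
    if "c" in solution:
        cost += 5.0
    if "a" in solution and "b" in solution:
        cost += 25.0
    return cost

result = greedy_redundancy(set(), {"a", "b", "c"}, toy_cost)
```

With this toy cost the procedure adds "a" in the first iteration and then stops, since adding "b" or "c" afterwards would raise the cost. The sketch re-evaluates the full cost for every candidate; caching the costs of transactions that do not use the modified object, as suggested in the text, is omitted for brevity.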


5.6  Alternative  Heuristic  Formulations 

The heuristic proposed above is “conservative”, in the sense that decisions taken
at each step involve the repetition of all cost evaluations concerning those objects
whose allocation is modified from the previous step. In that sense the proposed
heuristic is “sound” (decisions are always based on correct evaluations), but it is
also rather expensive from a computational viewpoint. As already shown, the
computation of transaction execution costs grows combinatorially with the number of
objects retrieved by the transaction itself; therefore the complexity of an iteration
decreases linearly as the algorithm evolves because of the reduction of variables to be
considered as candidates, but the complexity of transaction cost evaluation increases
combinatorially with the degree of redundancy of objects in the base solution.

In some cases, the dimensions of the database design problem are such that the
computational complexity of the proposed heuristic is too high; then the proposed
heuristic is a good basis for building “faster” heuristics, which sacrifice the accuracy
of the final result in order to avoid hard computations. In the following, such a
faster heuristic is presented which has the important property of being convergent
toward a relative minimum.

The  algorithm  consists  of  the  following  steps: 

step 1 Iteration 1 of the original algorithm is performed, and all the candidate
solutions S' corresponding to a cost C' which improves on the cost C* of the optimal
non-redundant solution are retained; these solutions are ranked in ascending order
of the associated cost C' (hence, the first solution corresponds to the most
convenient variable to be set to 1).

step 2 Then, variables are tentatively set to 1 according to the ranking order,
thus increasing progressively the degree of redundancy of the solution. At each
iteration, it is verified that the global cost has decreased with respect to the previous
iteration; however, this is done for the considered variable only.

step 3 If the above verification fails, then it is possible either to terminate
the process or to repeat Step 1 with the current solution as a basis, rank variables
according to their convenience, and then proceed with Step 2. In this case, the
optimization process terminates when Step 1 is performed without finding any new
solution which decreases the cost of the current one.
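The three steps can be sketched in Python as follows. As before, the global cost is a caller-supplied function of the set of variables that are 1. Two details are assumptions of this sketch rather than statements of the text: the flag `rerank` selects between the two options of Step 3, and Step 1 is also repeated after a pass in which every ranked variable was accepted. All names and the toy cost are hypothetical.

```python
def fast_heuristic(start, candidates, global_cost, rerank=True):
    """Rank single-variable improvements once, then apply them in order."""
    current = frozenset(start)
    current_cost = global_cost(current)
    while True:
        # Step 1: evaluate every single-variable extension and keep the
        # improving ones, cheapest first.
        scored = sorted(
            (global_cost(current | {var}), var)
            for var in set(candidates) - current
        )
        ranked = [var for cost, var in scored if cost < current_cost]
        if not ranked:
            return current, current_cost
        # Step 2: set variables to 1 in ranking order while the cost decreases.
        for var in ranked:
            trial_cost = global_cost(current | {var})
            if trial_cost >= current_cost:
                break  # Step 3: verification failed
            current, current_cost = current | {var}, trial_cost
        if not rerank:
            return current, current_cost

def toy_cost(solution):
    """Hypothetical global cost with an interaction between 'a' and 'b'."""
    cost = 100.0
    if "a" in solution:
        cost -= 30.0
    if "b" in solution:
        cost -= 10.0
    if "c" in solution:
        cost += 5.0
    if "a" in solution and "b" in solution:
        cost += 25.0
    return cost

result = fast_heuristic(set(), {"a", "b", "c"}, toy_cost)
```

The first ranked variable is always accepted, so every pass enlarges the solution and the loop terminates after at most as many passes as there are candidate variables.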


5.7  Introduction  of  Concurrency  Control  Costs  in  the  Design 

Synchronization costs can be introduced in the design of the redundant distribution
of a database to take into account the increase in complexity of concurrency control
due to the presence of redundant copies, which require updating within transactions.
An additional parameter CM is introduced which measures the cost of transmitting
one message between different sites. Concurrency control is also necessary for
databases without replication and for auditable retrieval transactions as well as for
update operations. The coefficient CC(Tk, S') will evaluate the amount of overhead
with respect to a non-redundant execution of the same transaction. Since our model
aims to be general, no particular scheme is assumed for the implementation of the
concurrency protocol, and we consider that synchronization overhead is proportional
to the number of copies to be updated. This number is furthermore proportional
to the number of fragments per partitioned object Np, when the transaction does
not match the partition. We have therefore

$$ CC(T_k, S') = \sum_{i \in U(k) \wedge u_i = 1} \Big( \sum_p N_p X_{ip} + \sum_j Y_{ij} \Big)\, CM $$

In selecting the best alternative for transaction execution the term CC(Tk, S')
should be added to AC(Tk, S') and TC(Tk, S').

5.8  Introduction  of  Storage  Limitations  in  the  Design 

Storage limitations at each database site may also be introduced into the model. In
modern systems storage costs may become a minor component of overall system cost.
However, with replication of data it may be useful to include in the optimization
constraints which account for storage limitations in order not to exceed available
storage at each site. Such a constraint is

$$ \sum_i \Big( \sum_p X_{ip}\, fr(i, p, j) + Y_{ij} \Big)\, size(i) \le SC_j \qquad \text{for every site } j $$

where SCj measures the storage capacity at site j.


6 EXAMPLE

An example of an application of the optimal non-redundant allocation model is
shown in Figs. 10 and 11. The example is quite simple, but it incorporates most of
the features which are typical of a database distribution problem.


6.1  Description  of  Requirements 

The requirements for the optimization model are described in Fig. 10a, b, and
c. We are considering a fully-connected, distributed database consisting of three
sites. The database schema which has to be distributed (Fig. 10a) consists of three
objects (DEPARTMENT, EMPLOYEE, and PROJECT) and three links (DEPT-PROJ,
DEPT-EMP, and PROJ-EMP). Quantitative parameters and the index numbers associated
with objects and links are also shown in the figure. Note that the link between
DEPARTMENT and PROJECT is included in the logical schema, but is never used
by the transactions which are considered; hence it is not subjected to distribution
optimization.

The candidate partitionings are shown in Fig. 10b. Partitionings 1 and 2 are
primary on DEPARTMENT and derived on EMPLOYEE via link DEPT-EMP; partitioning
3 is primary on PROJECT and derived on EMPLOYEE via link PROJ-EMP; partitioning
4 is primary on EMPLOYEE. Because of these definitions, the object EMPLOYEE can
be partitioned according to 4 alternative candidates (3 derived partitionings and 1
primary partitioning); consequently, variables x11, x12, x13, and x14 are introduced.
The object DEPARTMENT can be partitioned according to 2 alternative candidates
(both primary); correspondingly, variables x21 and x22 are introduced. Finally, the
object PROJECT can be partitioned according to only 1 candidate partitioning, which
is primary (variable x33).

The transactions are shown in Fig. 10c; quantitative parameters are described
through Transaction Specification Tables, as in Fig. 7. Transaction 1 is issued at
Site 1 and matches predicate P4. Therefore, if the EMPLOYEE object accessed by
the transaction were partitioned according to P4 in the optimal solution determined
by the optimization model, all accesses of T1 would be local to Site 1. (This, of course,
will also occur if Object 1 is allocated at Site 1 as a whole.) Transaction 2 can be
issued from all the sites, but it matches predicate P1 when it is issued from Site 1.
Transaction 4 can be issued from Sites 1 and 2 and in both cases matches partitioning
P2.


6.2  The  Optimisation  Model 

The optimization model for the example involves the use of 25 variables; in fact,
there are 7 Xip variables (corresponding to 4 primary and 3 derived partitionings),
3 Whp variables (for the links along which derived partitions are propagated), 9
variables Yij (each object can be allocated on each node as a whole), and 6 variables
Vhj (2 links are used by transactions, and can be local to each site). Moreover,
21 constraints are introduced (3 non-redundancy constraints, and 18 constraints
for modelling dependencies between variables Whp and Xip or Vhj and Yij). The
optimization was made using the Additive Generalized Balas Algorithm, which is a
general 0-1 integer linear solution method; the program was taken from [BaCC81].
The CPU time used for an optimization run was between 1 and 3 seconds on a
DEC-20/60 system.
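The variable count quoted above can be checked with a few lines of Python. The function and its argument names are hypothetical; the input figures are those stated in the text for the example (three objects with 4, 2 and 1 candidate partitionings, three sites, three derived partitionings, two links used by transactions).

```python
def count_variables(partitionings_per_object, n_sites, n_derived, n_used_links):
    """Tally the 0-1 variables of the non-redundant model for one problem."""
    n_objects = len(partitionings_per_object)
    counts = {
        "X": sum(partitionings_per_object.values()),  # object/partitioning pairs
        "W": n_derived,                               # derived partitionings
        "Y": n_objects * n_sites,                     # whole-object allocations
        "V": n_used_links * n_sites,                  # used links local to a site
    }
    counts["total"] = sum(counts.values())
    return counts

counts = count_variables(
    {"EMPLOYEE": 4, "DEPARTMENT": 2, "PROJECT": 1},
    n_sites=3, n_derived=3, n_used_links=2,
)
```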


6.3  Discussion  of  Results 

Figure 11 shows ten results of the optimization, obtained by varying the cost
parameters and the transaction frequencies. For simplicity, in the case of multiple
site transactions, the same frequency value is assigned to each version. A solution is
represented in the table by the variables which are set to 1; object and link variables
are shown in separate columns.

Cases 1, 2, and 3 show the effect of increasing transmission cost. With null
transmission cost (TC = 0), each object is allocated by itself, and no Vhj or
Whp variable appears in the solution (as join transmission costs are not evaluated).
However, by increasing transmission cost, the object allocation moves to site 1. This
is because, in the example, most transactions are issued from site 1, and therefore
this solution maximizes the locality of processing.

Cases 4 to 7 show the effect of increasing one of the transaction frequency values
in turn. Cases 4 and 5 still maintain the allocation of objects on site 1, because
transactions 1 and 2 are issued from site 1. However, case 6 presents the partitioning
of objects 1 and 3, which are used by transaction 3, according to partitioning 3,
which matches the transaction. Likewise, case 7 presents the partitioning of objects
1 and 2 according to partitioning 2, which is matched by transaction 4.

In case 8, the access costs are made equal to 1, without distinguishing local
versus remote and retrieval versus update accesses. Then, the objects move to site
3, where most retrievals take place.

Finally, cases 9 and 10 show the effect of increasing the access cost at site 1.
In case 9, the objects move to site 2; in case 10, the frequency of transaction 3,
issued from site 3, is increased, and consequently the objects move to site 3. While
the behavior of the allocation optimizer can be easily understood and explained a
posteriori, the allocation chosen by the model is not at all obvious a priori.


7 CONCLUSIONS AND FUTURE WORK

In this report we have dealt with the problem of distributing a database by
considering the logical schema of the database consisting of objects and predefined
links, where individual objects were in the form of normalised relations. By defining
four basic logical access types and using a rather straightforward model of
transaction execution, we were able to develop the necessary equations to estimate
the costs of transaction processing and to develop an optimisation model to minimise
these costs. A decomposition model was developed to make the problem
computationally feasible, and a heuristic procedure was discussed to incorporate
redundant allocation of entire objects or partitioned objects.

This paper has developed a methodology for the distribution design phase
(see Fig. 1) which fits in the overall framework of database design. This phase
is divided into a series of activities which are necessary before the optimization model
can in fact be applied. An important early step is the solicitation of partitioning
guidance from the user, which makes this model tractable. The example in Section 6
points out the scenario of a distribution design by considering various possible mixes
of transactions.

It is conceivable that after a database has been distributed and is operational
a restructuring is called for, due to the following reasons:

i Better transaction load estimates are available.

ii There is a need to introduce new objects and links in the database schema
and repopulate the database.

iii Cost parameters such as access costs and transmission costs have undergone
a change.

The approach that can be taken to deal with the above problem is to run a
revised optimization model. The revision consists of adding to each Cip or Dij the
cost of moving the already allocated object to the intended new location. This cost
should be averaged over the time period between two restructurings since all other
costs are per unit time. The non-redundant model could then be run with these
new cost parameters.

Redoing the problem under redundancy amounts to solving the old and the
new problems with new parameters and advocating restructuring if

$$ C_{old} - C_{new} > \frac{R}{P} $$

where Cold and Cnew are the total costs of transaction processing for the old and new
allocations using new parameters,

R is the cost of a one-time restructuring, and

P is the number of time units between two restructurings.
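The restructuring test can be written as a one-line check. The sketch below assumes the criterion compares the per-unit-time saving with the one-time restructuring cost averaged over the interval between restructurings, as the averaging argument in the text suggests; the function name and the sample numbers are hypothetical.

```python
def should_restructure(c_old, c_new, r, p):
    """True when the saving per unit time exceeds the restructuring cost r
    averaged over the p time units between two restructurings."""
    return (c_old - c_new) > r / p

worthwhile = should_restructure(c_old=1000.0, c_new=900.0, r=500.0, p=10.0)
not_worthwhile = should_restructure(c_old=1000.0, c_new=980.0, r=500.0, p=10.0)
```

In the first call the saving of 100 per time unit exceeds the averaged restructuring cost of 50; in the second the saving of 20 does not.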
A sophisticated database environment with a built-in design tool which is capable of
doing the above type of restructuring can be expected to monitor loads and trigger
a redistribution of data over the network when needed.

Future extensions to this work will address:

i A vertical partitioning of objects.

ii Different models for transaction execution.

iii More general models for description of data replication; e.g. the effectiveness
of redundantly allocating a single fragment.

iv Consideration of equivalent logical schemas for further optimisation; e.g.
when a subproblem is decomposed, alternative logical schema representations of the
subproblem should be investigated.

The  current  model  is  encouraging  since  it  permits  a  formal  solution  to  a  design 
problem  which  is  too  complex  to  be  solved  by  random  search  and  for  which  no 
good  directed  search  algorithms  are  known.  At  the  same  time  few  people  have 
the  experience  to  design  distributed  databases  by  intuition,  although  the  results 
obtained from testing our model were explainable in informal terms.


[diagram not reproduced]

Figure 1 The Overall Distributed Database Design Methodology


[diagrams not reproduced:
(a) a one-many relationship between CUSTOMER (Cust#, ...) and ACCOUNT (Acct#, Cust#, ...);
(b) a many-many relationship between CUSTOMER and ACCOUNT;
(c) a ternary relationship between PROJECT, PROGRAMMER and COMPUTER]

Figure 2 Use of Objects and Links to Model Different Types of Relationships


[diagram not reproduced; legible labels: MED-INFO, PAYROLL]

Figure 3 An Example of a Database Schema


[diagrams not reproduced:
(a) Database Schema (legible labels: MED-DEPT, MED, PROC);
a hierarchy in which the partitioning predicate for MED-DEPTS is based on Location;
a hierarchy in which the partitioning predicate for MED-DEPTS is based on ranges of D#]

Figure 4 A Database Schema and Its Possible Derived Partitioning Hierarchies


TDA: Total Direct Access.
     no. accessed = card(i)

SDA: Selective Direct Access.
     no. accessed is user specified

TJA: Total Join Access.
     no. accessed = N * ratio, if j is the member of h
                  = N, if j is the owner of h

SJA: Selective Join Access.
     no. accessed is user specified

Figure 5 Transaction Access Primitives


Transaction: Find all EMPloyees in the DEPartments in CAlifornia
             that have PROJects which need PART# = 7386.
             List EMP#, DEPT#, PROJ#, BUDGET.

[diagram not reproduced; legible labels: EMPLOYEE, Part# = 7386]

Figure 6 A Graphical Transaction Specification


TRANSACTION SPECIFICATION

Object      Used as  Used    No. of   Link   Partn'g  n    Retrvl  Evaluated
            Entry    for     Acces'd  Used   Pred.         vs      no. Bytes
            Point    Result  Tuples   Next   Matched       Update  X'mitted

part        1        0       1        P-PN   -        4    R       4
part-needs  0        0       16       PN-PR  -        4    R       64
project     0        0       16       PR-D   P1       12   R       192
department  0        0       10       D-E    P1       12   R       10×12=120
employee    0        1       200      -      P1       16   R       200

Origin at: Site 3
Freq33: 50/month
Figure 7 A Tabular Transaction Specification


Let S = {y11, y13, y21, y22} represent a redundant allocation of a database having 3
sites and 2 objects, where all non-mentioned variables are set to zero. Cases
a, b, c, d of Section 5.2 are represented below with diagrams, where solid
dots correspond to objects in SOL(S, Ik); arrows indicate precedences between
transaction accesses; and a vertical arrow connects accesses performed at the
same site, which do not require any transmission.

[case diagrams not reproduced]

Case a: Retrieval-Retrieval

Ik = {O1, O2}; SOL(S, Ik) = {{y11, y21}, {y11, y22}, {y13, y21}, {y13, y22}}

A different transaction execution is obtained from each solution in SOL(S, Ik).


Case b: Retrieval-Update

Ik = {O1}; SOL(S, Ik) = {{y11, y21, y22}, {y13, y21, y22}}

Transmissions to both copies of the updated object are required.


Case c: Update-Retrieval

Ik = {O2}; SOL(S, Ik) = {{y11, y13, y21}, {y11, y13, y22}}

Transmission is required from one of the copies of the updated object to the
retrieved object; the best transmission alternative is selected.


Case d: Update-Update

Ik = ∅; SOL(S, Ik) = {{y11, y13, y21, y22}}

Transmissions to both copies of Object 2 are required; for each copy, the best
alternative is selected.

Figure 8 Transaction Execution over a Redundantly Distributed Database


Start with Snew = S*, Cnew = C*
(* refers to the optimal non-redundant solution)

repeat
    S := Snew; C := Cnew;
    for every variable Xip or Yij which does not belong to S do
    begin
        build S' := (S + that variable set to 1);
        C' := 0;
        for every transaction Tk such that i ∈ U(Tk) do
        begin
            compute CT(Tk, S');
            C' := C' + CT(Tk, S');
        end;
        for every transaction Tk such that i ∉ U(Tk) do
            C' := C' + CT(Tk, S);
        C' := C' − Σi f(dri);
        if C' < Cnew then
        begin Snew := S'; Cnew := C' end
    end
until Cnew >= C;

S, S', Snew represent assignments of variables Xip and Yij corresponding to the old,
partial and new solution at each iteration. C, C', Cnew are the corresponding total
costs. The counts dri are the degrees of redundancy of each object i in solution S'.

Figure 9 Algorithm for Redundant Database Distribution


(a) Database Schema

[schema diagram not reproduced; legible labels: DEPARTMENT, DEPT-EMP,
Emp#, Proj#, Dept#; Name, Status, ...]


(b) Candidate Partitionings

P1: primary on DEPARTMENT (Object 2)
    pred(2, 1, 1): DNO in (1...20);  alloc(2, 1, 1) = 1; fr(2, 1, 1) = 0.4
    pred(2, 1, 2): DNO in (21...50); alloc(2, 1, 2) = 3; fr(2, 1, 2) = 0.6
    derived on EMPLOYEE via link DEPT-EMP

P2: primary on DEPARTMENT
    pred(2, 2, 1): LOCATION=Northern California; alloc(2, 2, 1) = 1; fr(2, 2, 1)
    pred(2, 2, 2): LOCATION=Central California;  alloc(2, 2, 2) = 2; fr(2, 2, 2)
    pred(2, 2, 3): LOCATION=Southern California; alloc(2, 2, 3) = 3; fr(2, 2, 3)
    derived on EMPLOYEE via link DEPT-EMP

P3: primary on PROJECT
    pred(3, 3, 1): TYPE=Software; alloc(3, 3, 1) = 1; fr(3, 3, 1) = 0.7
    pred(3, 3, 2): TYPE=Hardware; alloc(3, 3, 2) = 3; fr(3, 3, 2) = 0.3
    derived on EMPLOYEE via link PROJ-EMP

P4: primary on EMPLOYEE
    pred(1, 4, 1): STATUS=Regular;   alloc(1, 4, 1) = 2; fr(1, 4, 1) = 0.5
    pred(1, 4, 2): STATUS=Part-time; alloc(1, 4, 2) = 2; fr(1, 4, 2) = 0.2
    pred(1, 4, 3): STATUS=Fired;     alloc(1, 4, 3) = 3; fr(1, 4, 3) = 0.3

Figure  10  Example  of  a  Database  Schema,  Candidate  Partitionings,  and  Transactions 
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(c)  Transactions 


Tl:  Give  5%  raise  to  salaries  of  regular  employees 
ORIGIN:  Site  1  (FREQUENCY:  frequ) 


Object 

Used  as 
Entry 
Point 

Used 

for 

Result 

No. of 
Accus'd 
Tuples 

Link 

Used 

Next 

Partn'o 

Pred. 

Matched 

n 

Retrvl 

vs 

Update 

Evaluated 
no.  Bytes 
X'mitted 

EMPLOYEE 

Yes 

Yes 

2250 

— 

P4 

- 

U 

— 

T2:  List  name  and  salary  of  employees  with 

department  city  =  San  Francisco 

ORIGIN:  Site  1  (FREQUENCY:  Jrtqtt) 


DEPARTMENT  Yes            5       DEPT-EMP        P1       ?              ?    20
EMPLOYEE           Yes     450                     P1       ?              ?    1800

ORIGIN:  Site  2  (FREQUENCY:  frequ)  and  Site  3  (FREQUENCY:  frequ) 

DEPARTMENT  Yes            6       ?               ?        ?              ?    20
EMPLOYEE           ?       450     ?               ?        ?              ?    1800

T3: List all participants in hardware projects whose manager is Jones
ORIGIN:  Site  3  (FREQUENCY:  frequ) 


PROJECT     Yes            20      ?               P3       ?              ?    80
EMPLOYEE           Yes     45      ?               P3       ?              ?    180

T41: Give a 5% raise to employee salaries for the departments in Northern California
ORIGIN: Site 1 (FREQUENCY: frequ)


DEPARTMENT  Yes            13      DEPT-EMP        P2       4              R    52
EMPLOYEE           Yes     1170                    P2       -              U    -

T42: Give a 5% raise to employee salaries for the departments in Central California
ORIGIN: Site 2 (FREQUENCY: frequ)


DEPARTMENT  Yes            16      DEPT-EMP        P2       4              R    60
EMPLOYEE           Yes     1350                    P2       -              U    -

Figure 10 Example of a Database Schema, Partitionings, and Transactions (continued)
TABLE OF RESULTS

Columns in the original: Case; Cost Parameters (CLR, CRR, CLU, CRU, and TC); Transaction Frequency (f); Optimal Solutions (y and w); Cost. The frequency and solution entries are not legible in the source and are omitted here. Each cost parameter is shown on the row where it is printed, and "?" marks an illegible entry.

Case  CLR, CRR, CLU, CRU                        TC    Cost
1                                               0     33400.50
2                                               0.5   33553.00
3     CLR=1, CRR=2.5,                                 ?
4     CLU=5, CRU=10                                   1147403.00
5                                                     305613.25
6                                                     40016.00
7                                               1     1280025.00
8     CLR=CRR=CLU=CRU=1                               6243.00
9     CLR_i=10, CRR_i=25, CLU_i=50, CRU_i=100         43010.00
10    CLR=1, CRR=25, CLU=5, CRU=100                   50702.00
      (subscripts illegible)

Figure  11  Optimal  Non-Redundant  Solution  for  Several  Values  of  the  Cost  Parameters 
and  the  Transaction  Frequencies 


